schnitzer-lab / EXTRACT-public

EXTRACT is a tractable and robust automated cell extraction tool for calcium imaging, which extracts the activities of cells as time series from both one-photon and two-photon calcium imaging movies.
MIT License

Edge artifacts when using partitions #48

Open dimokaramanlis opened 8 months ago

dimokaramanlis commented 8 months ago

Hi EXTRACT team,

I have been trying to get your algorithm to work for microendoscope movies from the mouse brain. Cell detection works quite well, and I am very happy with the results. Thanks for the amazing software.

I have been following the recipe of downsample_and_run_extract.m for my movie: I first downsampled the movie in time and manually sorted some cells. I then ran EXTRACT on the full movie while providing only the sorted spatial filters from S_init, 280 in total. Because the movie is quite long, I split it into nine partitions (3 x 3) so that my RAM can handle it. To my surprise, the final results now contain a larger number of filters (327), and there appear to be multiple filters for cells that fall between partitions. This is illustrated below, where I have plotted the average of all spatial filters before and after the final step.
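For reference, a minimal sketch of the run described above. The field names here follow EXTRACT's usual config convention, but please treat them as assumptions and verify against the current source; get_defaults, num_partitions_x/y, and extractor are the names I believe the toolbox uses.

```matlab
% Hypothetical sketch: full movie M, sorted filters from S_init,
% and a 3 x 3 partition grid to limit RAM use.
config = get_defaults([]);
config.S_init = S_init_sorted;   % 280 manually sorted spatial filters
config.num_partitions_x = 3;     % split the field of view into 3 x 3
config.num_partitions_y = 3;
output = extractor(M, config);   % M is the full-length movie
```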

[Image: extract_question — average of all spatial filters before and after the final partitioned run]

Do you know how to remove edge artifacts while using partitions? Is there a way besides using more RAM?

Thank you in advance for the help.

fatihdinc commented 8 months ago

Hi Dimokratis, Thank you so much for bringing this up! This is a known bug that has been bothering me for a while; it has gone unnoticed by most users, and I think there is a very straightforward way to fix it internally. I will include the fix in the next release. For now, let me first explain why it happens and then show how to fix it externally.

It happens because partitions overlap, so cells at the boundary get picked up by multiple partitions at once. Since those cells are likely cut off by the partial overlaps, EXTRACT's internal duplicate removal is sometimes unable to remove them properly. In this scenario, here is an external fix:

Step 1) Run final robust regression by setting config.remove_duplicate_cells = 0. We do not want internal duplicate removal, we will do so externally.

Step 2) The output will contain duplicate cells. We can simply use the match_sets.m function inside EXTRACT to match cells between S_init and the new output file (S_new = reshape(output.spatial_weights,nx*ny,[])). A cutoff like 0.95 should be able to recover the correct matches.

Step 3) Pick the fully matched cells from the new output.
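The three steps above could be sketched roughly as follows. Note this is only an illustration: the exact signature and return format of match_sets.m may differ from what I assume here (index pairs with one row per set), so please check the function in the EXTRACT source before using it.

```matlab
% Step 1: rerun the final robust regression without internal duplicate removal
config.remove_duplicate_cells = 0;
output = extractor(M, config);

% Step 2: match cells between S_init and the new spatial weights
[nx, ny, ~] = size(output.spatial_weights);
S_new = reshape(output.spatial_weights, nx * ny, []);
idx_pairs = match_sets(config.S_init, S_new, 0.95);  % cutoff of 0.95

% Step 3: keep only the fully matched cells from the new output
keep = idx_pairs(2, :);  % assuming row 2 indexes the new set
output.spatial_weights  = output.spatial_weights(:, :, keep);
output.temporal_weights = output.temporal_weights(:, keep);
```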

This should take no more than 2 lines of code and virtually zero additional runtime until I release a new version, which is planned for the end of the summer. Please let me know if you have any additional concerns and thank you for bringing this up!

dimokaramanlis commented 8 months ago

Hi Fatih,

thank you very much for the quick reply. That makes complete sense. I re-ran the final extraction with no duplicate removal and got back all the cells found previously! I didn't use a cutoff; instead, for each initial cell I simply picked the new cell whose spatial filter correlates best with it. My implementation is below, in case it helps.

S_new = reshape(output.spatial_weights, [], size(output.spatial_weights, 3));
cmat = corr(config.S_init, S_new);   % correlate initial vs. new spatial filters
[allmax, imax] = max(cmat, [], 2);   % best new match for each initial cell
assert(all(allmax > 0.95))           % every initial cell should have a strong match
output.spatial_weights = output.spatial_weights(:, :, imax);
output.temporal_weights = output.temporal_weights(:, imax);