use of ScanImage offset in tileLoad can cause problems

raacampbell commented 5 years ago

Tiffs produced by CPBT SIBT report the wrong offset from ScanImage. The tile stats are correct, however, so we ought to use those. Look into what is going wrong.

The problem produces bright tile edges. (Image: tileEdges)

I don't recall why I went with the ScanImage offset value.

raacampbell commented 5 years ago

This seems to happen only for the test data where the frames to be averaged are saved separately rather than together.

raacampbell commented 4 years ago

This happened again, with a sample that has no background tiles. The issue is that the ScanImage offset may be wrong. The INI file says we use the tile stats file, but in fact we don't!

Can fix the above issue by making sure the average tile has no negative numbers. This is now implemented in a slightly hackish way: 780f797b29d791000ea1eaf96583b3227cddd270
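A minimal Python sketch of one way to make sure an average tile has no negative numbers. It is not the code from the commit above, which is not reproduced here. The function name `remove_negatives` and the choice to shift the tile rather than clip it are assumptions made for illustration.

```python
def remove_negatives(avg_tile):
    """Shift a 2-D average tile (list of rows) so its minimum is >= 0.

    Hypothetical helper: a tile that is already non-negative is copied unchanged.
    """
    lowest = min(min(row) for row in avg_tile)
    if lowest >= 0:
        return [list(row) for row in avg_tile]
    # Subtracting the (negative) minimum raises every pixel by the same amount,
    # so differences between pixels are preserved.
    return [[pixel - lowest for pixel in row] for row in avg_tile]


tile = [[-2.0, 1.0], [3.0, 0.5]]
print(remove_negatives(tile))  # [[0.0, 3.0], [5.0, 2.5]]
```

Shifting keeps the shape of the illumination profile intact, whereas clipping to zero would flatten the darkest region. Which of the two is preferable here is an open question.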

ablot commented 2 years ago

I also have negative values in my grand average on cricksaw. The hackish fix helps but doesn't entirely solve the issue. I can avoid it to some extent by cropping more before stitching, but I'm surprised that I get negative values after averaging thousands of frames.

Is this related to ScanImage offset? It seems that the average is done using tileStats.offsetMean (L87 of calAverageMatFiles.m) which should be the offset from the GMM fit.

Could it be because I get plenty of "Empty tile threshold not trustworthy" warnings and the average is made on frames with no brain?
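One possible explanation for negative values surviving the averaging of thousands of frames: averaging shrinks the noise but not a bias. If the subtracted offset is too high, the average of an empty pixel settles on a negative number instead of zero. The Python sketch below illustrates this with made-up numbers (`TRUE_OFFSET`, `ESTIMATED_OFFSET` and `NOISE_SD` are all assumptions, not values from this dataset).

```python
import random
import statistics

random.seed(0)

TRUE_OFFSET = 100.0       # assumed real dark level of the detector
ESTIMATED_OFFSET = 110.0  # assumed over-estimate that gets subtracted
NOISE_SD = 20.0           # assumed per-frame noise


def grand_average_of_empty_pixel(n_frames):
    """Average one background pixel over n_frames after offset subtraction."""
    samples = (random.gauss(TRUE_OFFSET, NOISE_SD) - ESTIMATED_OFFSET
               for _ in range(n_frames))
    return statistics.fmean(samples)


few = grand_average_of_empty_pixel(10)
many = grand_average_of_empty_pixel(5000)
print(f"10 frames: {few:.2f}, 5000 frames: {many:.2f}")
```

With many frames the result converges on `TRUE_OFFSET - ESTIMATED_OFFSET`, here -10, rather than on zero. Whether that is what happens in the real data is not established by this sketch.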

raacampbell commented 2 years ago

Does it look like the image above? How negative are your values? The code that does this tile threshold and offset correction is a bit dodgy. It has worked acceptably for some time so I've not done much with it in ages, but it is creaking at the seams.

The "Empty tile threshold not trustworthy" warning happens a lot for us too.

ablot commented 2 years ago

It's a lot less bad and just on one corner. Example full frame: (image: raw_image_91_3_7_5_3_full_view)

Just the bad corner, because it's hard to see on the full frame: (image: raw_image_91_3_7_5_3_corner)

It's not too bad but just bright/big enough to mess with cell detection. I can fix it for now by increasing the cropping but I was very surprised to see that my grand average goes negative. The minimum value is about -10.

ablot commented 2 years ago

We switched to the ScanImage offset before the GMM was properly implemented. It used to be calculated on each call of tileLoad, IIRC, and produced issue #57. Using a constant offset, read from the first acquired ScanImage stack, fixed the issue.

Maybe now that the GMM is working, we could stop using scanimage metadata. Maybe it would be prudent to have at least an option to keep the offset constant for the whole dataset (in case GMM estimation fails on one section for some reason).
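A minimal Python sketch of what such an option could look like. Everything in it is hypothetical: the function name `dataset_offset`, the `keep_constant` flag, the use of `None` to mark a section where estimation failed, and the choice of the median as the constant value.

```python
import statistics


def dataset_offset(per_section_offsets, keep_constant=True):
    """Return one offset per section.

    per_section_offsets: list of floats, with None where estimation failed.
    keep_constant: if True, every section gets the median of the valid values;
                   if False, only failed sections fall back to that median.
    """
    valid = [o for o in per_section_offsets if o is not None]
    if not valid:
        raise ValueError("no section produced a usable offset")
    fallback = statistics.median(valid)
    if keep_constant:
        return [fallback] * len(per_section_offsets)
    return [fallback if o is None else o for o in per_section_offsets]


offsets = [101.0, None, 99.0, 100.0]
print(dataset_offset(offsets))                       # [100.0, 100.0, 100.0, 100.0]
print(dataset_offset(offsets, keep_constant=False))  # [101.0, 100.0, 99.0, 100.0]
```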

raacampbell commented 2 years ago

The GMM estimation is prone to failure, especially if it's run on each section. Which issue do you mean? The link for 57, above, seems to be for something else.

Maybe now is a good time to look into this stuff again using your dataset. Maybe revisit how the empty tiles are found? Can you easily get me a subset of your data to play with?

I'm disturbed it's messing up the cell detection. Certainly this problem has cropped up elsewhere but I bet nobody noticed. Is the problem with the detection severe?

ablot commented 2 years ago

Oh sorry it's bakingtray issue 57: https://github.com/SainsburyWellcomeCentre/BakingTray/issues/57

I'll get a hard drive and bring the data to you next week. I can also try to help if we come up with a plan.

It's not a terrible issue (especially since I can crop it away). Some fake cells are detected when the corner with noise is around a brighter spot. This might also be because the network hasn't been retrained yet on cricksaw data (and produces far too many false detections for now).

ablot commented 2 years ago

I tried to improve the offset estimation.

The SI offset is unreliable.
The GMM fit on the 10% dimmest tiles is very, very slow.
The GMM fit tends to overestimate the offset or fails to converge if we give it only 2 components.
The GMM fit on the average of the 10% dimmest tiles is an overestimate.
The GMM fit on the dimmest tile is better but can vary from section to section.

Here is an example histogram for one of my channels (the arrowhead marks the mean): (image: offset_BRYC64 2h_channel1)

So I decided to fit a GMM with 3 components (5 gave the same result, 2 seemed to estimate it a bit higher) to the dimmest tile of each section and then take the median of these offsets as my true offset. That worked for me:

(Green: original image; magenta: new offset subtraction. It's all the same except for the crazy noise, which is much reduced.) (Image: fixed_offset)
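The approach described above (3-component GMM on the dimmest tile of each section, then the median across sections) can be sketched in Python as follows. This is not the code in the PR. The plain-EM fitter, the function names, the synthetic tiles, and in particular the rule that the heaviest mixture component is treated as the offset peak are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_gmm_1d(values, n_components=3, n_iter=50):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, variances)."""
    data = sorted(values)
    n = len(data)
    # Start the means at evenly spaced quantiles and share one broad variance.
    means = [data[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    variances = [statistics.pvariance(data) or 1.0] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel value.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from the responsibilities.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # leave an empty component untouched
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)
    return weights, means, variances


def section_offset(tiles, n_components=3):
    """Offset for one section: GMM on its dimmest tile (the one with the lowest mean)."""
    dimmest = min(tiles, key=statistics.fmean)
    weights, means, _ = fit_gmm_1d(dimmest, n_components)
    # Assumption: the heaviest component is the background (offset) peak.
    heaviest = max(range(n_components), key=lambda k: weights[k])
    return means[heaviest]


def dataset_offset(sections):
    """Median of the per-section offsets, used as one constant for the dataset."""
    return statistics.median(section_offset(tiles) for tiles in sections)


random.seed(1)


def fake_tile(signal_fraction, n_pixels=400, offset=100.0):
    """Synthetic tile: noise around `offset` plus some brighter tissue pixels."""
    n_signal = int(signal_fraction * n_pixels)
    background = [random.gauss(offset, 3.0) for _ in range(n_pixels - n_signal)]
    tissue = [random.gauss(offset + 60.0, 20.0) for _ in range(n_signal)]
    return background + tissue


# Three sections, each with one dim tile and one bright tile (flat pixel lists).
sections = [[fake_tile(0.1), fake_tile(0.5)] for _ in range(3)]
print(round(dataset_offset(sections), 1))
```

Taking the median across sections means a single section with a poor fit has little influence on the final value, which fits the observation above that the per-section estimate can vary.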

This is in PR #196

raacampbell commented 1 year ago

I have seen cases where the GMM on the dimmest tile returns an offset that is too negative and this causes obvious tiling artifacts. Proposed solution: provide option to save mode of histogram of dimmest tile.
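A minimal Python sketch of the proposed alternative: take the mode of the histogram of the dimmest tile. It assumes integer pixel values, so each distinct value acts as one histogram bin. The function name and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def histogram_mode(tile_pixels):
    """Most frequent integer pixel value in a tile (flat iterable of ints)."""
    counts = Counter(tile_pixels)
    best = max(counts.values())
    # On ties, prefer the lowest value so the result is deterministic.
    return min(value for value, count in counts.items() if count == best)


dimmest_tile = [98, 99, 100, 100, 100, 101, 101, 140, 180]
print(histogram_mode(dimmest_tile))  # 100
```

Unlike the minimum, the mode is not dragged down by a handful of unusually low pixels, and unlike the mean it is not pulled up by the few bright pixels in the example.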

EDIT: it is possible this is a non-issue, but nonetheless 207be72fa3faf20512b969cb6974511b8d691107 clarifies what these offsets do and 861df9795e3ef75b71c52a85c08a872d062bcc06 adds a possibly useless offset type that uses the mean of the dimmest tile rather than the min.

raacampbell commented 8 months ago

Ha! An issue I just discovered is that the offset file is not updated when collateAverageImages runs. So the grand average used for illumination correction has a slightly incorrect offset value subtracted from it. The easiest solution is to delete the .mat files that cache the offset values whenever collateAverageImages is run. This will force the data in these files to be re-generated the next time they are requested for a stitching operation.
Indeed this fixes issues I see in a test dataset and I can't see how the fix could result in unintended negative consequences. 16db7d3aa8950cee6fc68c397ac276e51cce1fe7
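The idea of invalidating the cache can be sketched in Python as below. This is an illustration of the pattern only, not the code in the commit. The directory layout, the `*offset*.mat` file-name pattern, and both function names are hypothetical.

```python
from pathlib import Path


def invalidate_offset_cache(average_dir, pattern="*offset*.mat"):
    """Delete cached offset files so they are rebuilt on the next request.

    The directory layout and the file-name pattern are hypothetical.
    Returns the names of the files that were removed.
    """
    removed = []
    for cache_file in sorted(Path(average_dir).glob(pattern)):
        cache_file.unlink()
        removed.append(cache_file.name)
    return removed


def collate_average_images(average_dir):
    """Stand-in for the collation step: recompute averages, then drop stale caches."""
    # ... recompute and save the grand averages here (omitted in this sketch) ...
    return invalidate_offset_cache(average_dir)
```

Deleting the cache rather than updating it in place keeps a single code path for generating the offsets: whatever code reads the cache next simply finds it missing and rebuilds it.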

SWC-Advanced-Microscopy / StitchIt

use of ScanImage offset in tileLoad can cause problems #145