Dana-Farber-AIOS / pathml

Tools for computational pathology
https://pathml.org
GNU General Public License v2.0

Question about the StainNormalizationHE transform #329

Open ckv1110 opened 2 years ago

ckv1110 commented 2 years ago

Hi, I've used the PathML StainNormalizationHE transform with varying success in my pipeline. I used the macenko method to normalize images at a tile size of 1000 x 1000 px. However, when I reduced the tile size to anything below 1000 x 1000 px, it started raising LinAlgError('Eigenvalues did not converge') errors. Here are screenshots of the errors that I encountered:

Distributed = True
[Screenshot from 2022-08-22 15-30-53]

Distributed = False
[Screenshot from 2022-08-22 15-35-39]

I'm aware that there are multiple variables to play with, but the 2nd screenshot indicates that it is an issue with the optical_density_threshold variable. I have therefore started trying OD threshold values below 0.15 to see how they affect the error. So far, the errors come up less frequently at an OD threshold of 0.1, but I am open to suggestions for other variables to tweak.

I also noticed the fit_to_reference function in PathML, which could alter the stain_matrix_target and max_c_target. Could this be the more efficient solution, since it would set a reference stain_matrix_target and max_c_target closer to my other H&E slides?

Thank you very much for your time.

Chun

jacob-rosenthal commented 2 years ago

These approaches use numerical methods to find the stain vectors. This error means that the numerical methods aren't converging, which could happen for any number of reasons. One scenario, for example, is applying the stain normalization transform to a tile that contains only white space/background. In that case, if all the pixels are nearly identical, there might not be enough variance to fit the stain vectors to. That could also explain why you only see this with small tile sizes: if the tissue regions are more than 1000 px away from the edge of the image, then you will get some pure background tiles when using a tile size below 1000 px.
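
One way to guard against the background-tile scenario is to check how much of a tile looks like tissue before normalizing it, and to catch the convergence error for any tile that still fails. The following is a minimal sketch under stated assumptions, not PathML's own logic: tissue_fraction, safe_normalize, and the min_tissue_fraction cutoff are hypothetical names and values chosen for illustration, and normalize_fn stands for whatever callable applies the stain normalization to an RGB array.

```python
import numpy as np


def tissue_fraction(tile_rgb, od_threshold=0.15, background_intensity=255):
    """Fraction of pixels whose optical density exceeds od_threshold in every channel.

    Hypothetical helper: the OD conversion here is the usual -log(I / I0),
    which may differ in detail from what the library does internally.
    """
    rgb = np.clip(tile_rgb.astype(np.float64), 1.0, None)  # avoid log(0)
    od = -np.log(rgb / background_intensity)
    return float(np.mean(np.all(od > od_threshold, axis=-1)))


def safe_normalize(tile_rgb, normalize_fn, min_tissue_fraction=0.05, od_threshold=0.15):
    """Return the normalized tile, or None if the tile was skipped or the solver failed."""
    # Skip tiles that are almost entirely background (cutoff is an assumed value).
    if tissue_fraction(tile_rgb, od_threshold) < min_tissue_fraction:
        return None
    try:
        return normalize_fn(tile_rgb)
    except np.linalg.LinAlgError:
        # e.g. "Eigenvalues did not converge" during stain vector estimation
        return None
```

Tiles for which safe_normalize returns None can then be dropped or logged, which also gives a quick way to check whether the failing tiles really are mostly background.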

fit_to_reference() can be used to set what counts as the "normal" H&E stain, i.e. the target stain matrix used for normalization; otherwise the default is used.
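
A rough sketch of how fit_to_reference() might be used in practice: fit the target stain once on a representative tile, then normalize the remaining tiles against it. The helper name normalize_with_reference is hypothetical, and the sketch assumes the normalizer exposes fit_to_reference(image) as mentioned above plus an F(image) method that applies the transform to an array; the commented-out construction is likewise an assumption about the PathML interface and is not run here.

```python
def normalize_with_reference(normalizer, reference_tile, tiles):
    """Fit the normalizer's target stain to reference_tile, then normalize each tile.

    `normalizer` is any object with fit_to_reference(image) and F(image)
    methods (assumed interface, see the lead-in).
    """
    # Replaces the default target stain with one estimated from the reference.
    normalizer.fit_to_reference(reference_tile)
    return [normalizer.F(tile) for tile in tiles]


# With PathML installed, the normalizer might be built like this (assumption, untested here):
# from pathml.preprocessing import StainNormalizationHE
# normalizer = StainNormalizationHE(target="normalize", stain_estimation_method="macenko")
```

The reference tile should itself contain plenty of tissue, since its stain vectors are estimated numerically as well. Fitting a reference changes the normalization target only, so it would likely not by itself prevent convergence errors on tiles that are pure background.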

By the way, in the future it's preferred to paste the code/output itself instead of using screenshots, so that they can be copy/pasted and the contents are searchable for other people running into the same errors in the future. You can surround it in ``` to create a code block in github markdown. Thanks