azizilab / starfysh

Spatial Transcriptomic Analysis using Reference-Free auxiliarY deep generative modeling and Shared Histology
BSD 3-Clause "New" or "Revised" License

Normalization of Images #43

Closed psl-schaefer closed 3 months ago

psl-schaefer commented 4 months ago

In the paper it says that the "original H&E images are first normalized to $[0,1]$ per channel". But I don't see that in the code if hchannel=False (or rather the normalization is commented out). Referring to the code here:

https://github.com/azizilab/starfysh/blob/7407267515d56a7dc96672c764a40635fae581d6/starfysh/utils.py#L725

Did you decide not to scale the image channels, or is this happening somewhere else in the code?

YinuoJin commented 4 months ago

Hi schae211,

Thanks for the inquiry! For the datasets we processed, the spatial information was stored independently of the .h5ad expression file, so we appended the image with normalization here: https://github.com/azizilab/starfysh/blob/7407267515d56a7dc96672c764a40635fae581d6/starfysh/utils.py#L283. We will uncomment the line you highlighted for consistency with datasets whose spatial information is stored within the h5 files.

psl-schaefer commented 4 months ago

Okay, thanks a lot for the clarification!

psl-schaefer commented 4 months ago

A minor follow-up: the uncommented code does not normalize each channel separately, so one would actually need something along these lines:

# Cast to float first: assigning scaled values into an integer image would truncate them
adata_image = adata_image.astype(float)
n_channels = adata_image.shape[-1]
for channel_idx in range(n_channels):
    channel = adata_image[:, :, channel_idx]
    adata_image[:, :, channel_idx] = (channel - channel.min()) / (channel.max() - channel.min())
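
The same per-channel scaling can also be written without a loop. The following is only a sketch, not code from the Starfysh repository: the function name `normalize_per_channel` and the `eps` parameter are made up for illustration, and it assumes the image is an (H, W, C) NumPy array. It also guards against a constant channel, where max equals min and the division would otherwise produce NaN.

```python
import numpy as np

def normalize_per_channel(image, eps=1e-8):
    """Min-max scale each channel of an (H, W, C) image to [0, 1] independently.

    Hypothetical helper for illustration; not part of Starfysh.
    """
    img = np.asarray(image, dtype=np.float64)
    # Reduce over the two spatial axes; keepdims lets the result broadcast back.
    ch_min = img.min(axis=(0, 1), keepdims=True)
    ch_max = img.max(axis=(0, 1), keepdims=True)
    # A constant channel has zero range; divide by 1 there so it maps to 0, not NaN.
    ch_range = np.where(ch_max - ch_min > eps, ch_max - ch_min, 1.0)
    return (img - ch_min) / ch_range

# Toy example: a random uint8 array standing in for adata_image.
rng = np.random.default_rng(0)
toy = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
scaled = normalize_per_channel(toy)
```

Because the function returns a new float array rather than modifying its input, it behaves the same for integer and float images.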

psl-schaefer commented 4 months ago

And maybe a slightly unrelated question: when would you recommend applying the binary color deconvolution to extract the hematoxylin channel (i.e. setting hchannel=True)?

YinuoJin commented 4 months ago

Yes, we found that with binary color deconvolution (the hematoxylin channel), the intensity is positively correlated with cell density. For high-resolution images the histology integration should be helpful, but if your image is visually homogeneous or only available at low resolution, running with PoE=False would be better.
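
For readers unfamiliar with color deconvolution, here is a rough, generic sketch of how a hematoxylin channel can be pulled out of an RGB H&E image. It is not Starfysh's implementation, which this thread does not show. It assumes a float (H, W, 3) image in [0, 1] and uses the commonly cited Ruifrok and Johnston stain vectors; the function name `hematoxylin_channel` is hypothetical.

```python
import numpy as np

# Commonly used Ruifrok & Johnston stain vectors (assumption: rows are
# hematoxylin, eosin, DAB; columns are R, G, B optical-density contributions).
RGB_FROM_HED = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11],
                         [0.27, 0.57, 0.78]])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)

def hematoxylin_channel(rgb):
    """Return an (H, W) hematoxylin intensity map scaled to [0, 1].

    Hypothetical helper for illustration; expects RGB floats in [0, 1].
    """
    # Clip away zeros so the logarithm below stays finite.
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1e-6, 1.0)
    # Beer-Lambert: optical density is the negative log of transmitted light.
    optical_density = -np.log(rgb)
    # Project optical density onto the stain basis; negative amounts are not physical.
    stains = np.maximum(optical_density @ HED_FROM_RGB, 0.0)
    h = stains[..., 0]
    h_range = h.max() - h.min()
    # A uniform image has no contrast; return zeros instead of dividing by zero.
    return (h - h.min()) / h_range if h_range > 0 else np.zeros_like(h)

# Toy example: one unstained (white) pixel next to one purely hematoxylin-stained pixel.
white = np.ones(3)
stained = np.exp(-1.5 * RGB_FROM_HED[0])
toy_image = np.stack([white, stained])[np.newaxis, :, :]  # shape (1, 2, 3)
h_map = hematoxylin_channel(toy_image)
```

In this toy image the white pixel maps to 0 and the stained pixel to 1, which matches the intuition above: denser nuclei absorb more hematoxylin and give a higher value.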