labsyspharm / scimap

Spatial Single-Cell Analysis Toolkit
https://scimap.xyz/
MIT License

CODEX dataset with 200 images #119

Open andreevg04 opened 1 day ago

andreevg04 commented 1 day ago

I am trying to use scimap (the latest version) on a dataset with 200 images previously segmented with StarDist. The intensity range differs from that of the MCMICRO pipeline, but the segmentation result was stellar nevertheless. I have been trying different tweaks, including unsupervised clustering. The problem was that I got really strange clusters at resolutions varying from 0.5 to 3: I get CD45+CK+ cells, which are probably the invading ones, but I can't seem to annotate them. So I tried some preprocessing, with and without normalization, like this:

import scanpy as sc

# Normalize total intensity per cell so that cells are comparable
sc.pp.normalize_total(adata)
# Log-transform to dampen extreme values
sc.pp.log1p(adata)
# Store the normalized, log-transformed data in adata.raw
adata.raw = adata.copy()
# Scale each marker to zero mean and unit variance
sc.pp.scale(adata)

I also tried it without normalization and without scaling. Then I tried sm.pp.rescale with automatically picked gates, but the problem there was that the gate is set at 0.5 for each marker, which might not be right for all markers, and I got the same results with too many falsely annotated cells. What would be your approach? Would segmenting through MCMICRO make it better? (I don't know what resolution should be used for MCMICRO when processing CODEX.) Thank you very much!
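To make the "gate at 0.5" idea concrete, here is a minimal NumPy sketch of gate-centred rescaling for a single marker. It is only an illustration and not scimap's implementation: the function name `rescale_with_gate`, the example intensities, and the gate of 500 are all hypothetical.

```python
import numpy as np

def rescale_with_gate(values, gate):
    """Map intensities onto [0, 1] so that `gate` lands exactly at 0.5.

    Illustrative helper (not part of scimap): values below the gate are
    stretched onto [0, 0.5), values at or above it onto [0.5, 1].
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if not lo < gate < hi:
        raise ValueError("gate must lie strictly between the min and max value")
    scaled = np.empty_like(values)
    below = values < gate
    # Negative cells: [lo, gate) -> [0, 0.5)
    scaled[below] = 0.5 * (values[below] - lo) / (gate - lo)
    # Positive cells: [gate, hi] -> [0.5, 1]
    scaled[~below] = 0.5 + 0.5 * (values[~below] - gate) / (hi - gate)
    return scaled

# Hypothetical raw CK intensities for five cells, log1p-transformed
ck = np.log1p(np.array([10.0, 50.0, 200.0, 3000.0, 20000.0]))
# Hypothetical manual gate of 500 on the raw scale
ck_scaled = rescale_with_gate(ck, gate=np.log1p(500.0))
ck_positive = ck_scaled >= 0.5
```

After this kind of rescaling, 0.5 means the same thing for every marker only if each marker's own gate was chosen sensibly beforehand, which is why a poorly picked gate shows up as falsely positive cells.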

ajitjohnson commented 1 day ago

Hi @andreevg04,

I wouldn't recommend using the auto rescale function in scimap, as it is known to be imperfect, and I've been considering removing it. In our lab, we use manual visual gating. However, with a dataset of 200 images, this approach is understandably impractical.

We recently developed cSPOT specifically for such purposes, though it’s still in early testing.

I doubt that using MCMICRO will resolve the issue, as it seems more likely to be a segmentation artifact. I’d suggest visually inspecting the CD45+CK+ cells overlaid on the image along with the segmentation mask, using the image_viewer function. This should help confirm if these are cases of signal spillover and/or segmentation artifacts.
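A minimal sketch of how the double-positive cells could be picked out before overlaying them. It assumes a cells-by-markers matrix already rescaled so that 0.5 is the positivity cutoff; the helper `double_positive`, the marker list, and the values are hypothetical and only for illustration.

```python
import numpy as np

def double_positive(scaled, markers, marker_a, marker_b, cutoff=0.5):
    """Boolean mask of cells at or above `cutoff` for both markers."""
    a = scaled[:, markers.index(marker_a)]
    b = scaled[:, markers.index(marker_b)]
    return (a >= cutoff) & (b >= cutoff)

# Hypothetical rescaled matrix: rows are cells, columns follow `markers`
markers = ["CD45", "CK", "CD3"]
scaled = np.array([
    [0.9, 0.1, 0.8],   # lymphocyte-like
    [0.2, 0.9, 0.1],   # tumor-like
    [0.8, 0.7, 0.6],   # CD45+CK+ candidate
    [0.1, 0.2, 0.1],   # negative for all three
])

mask = double_positive(scaled, markers, "CD45", "CK")
candidate_idx = np.flatnonzero(mask)   # row indices of cells to inspect
fraction = mask.mean()                 # share of double-positive cells
```

The resulting indices can be stored as a label column and overlaid on the image, so that spillover from a neighbouring cell can be told apart from a mask that spans two cells.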

andreevg04 commented 1 day ago

I did inspect them, and of course most of them were lymphocytes, but some were tumor cells in close proximity to a lymphocyte. So I am kind of lost about what to do. Do you think the scanpy preprocessing makes sense, and how should I annotate each cluster? I tried different strategies and I always hit a wall. I could send you a matrixplot of the subset of markers I use to annotate the main cell types; it is still not clear which is which. I made the matrixplot both with the normalized, log1p data and with the normalized, log1p, scaled data. There are slight differences, but it is still unclear. I might try cSPOT; I will read the tutorial.

ajitjohnson commented 1 day ago

Interesting. Could it be that the high background levels in CK are contributing to this issue?

andreevg04 commented 1 day ago

It's just the expression; the signal generated from StarDist is indeed really high. When I look at the background, it is actually clean; just the cells are extremely positive.