labsyspharm / scimap

Spatial Single-Cell Analysis Toolkit
https://scimap.xyz/
MIT License

sm.pp.rescale and sm.tl.phenotype for phenotyping not adhering to user-defined gates #106

Closed LukasHats closed 2 months ago

LukasHats commented 2 months ago

Dear @ajitjohnson

Thanks for providing this awesome package. I am currently struggling with scimap assigning cells to a certain phenotype even though they clearly do not pass the manually set threshold. Here is the problem in more detail:

I have an imaging mass cytometry dataset. I adjusted the table to match the mcmicro output format and used:

adata = sm.pp.mcmicro_to_scimap(export,split="area", CellId="CellID", drop_markers = ["HistoneH3", "CD98", " 1", " 2", " 3", " 4", " 5", " 6", "191Ir", "193Ir"])

to load the data.

I started phenotyping using the following approach:

  marker_of_interest = 'CD138'
  sm.pl.gate_finder (image_path, adata, marker_of_interest,
                     from_gate = 0, to_gate = 5, increment = 0.1, 
                     markers=['HistoneH3', 'CD3', 'CD38', "CD68", "MPO"], point_size=3, 
                     seg_mask = mask_path)

Did this for all markers and stored the gates in a csv, then used:

  manual_gate = pd.read_csv('manual_gates_scimap.csv')
  adata = sm.pp.rescale (adata, gate=manual_gate)
  phenotype = pd.read_csv('phenotyping_scheme_scimap.csv')
  adata = sm.tl.phenotype_cells (adata, phenotype=phenotype, label="phenotype") 
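For readers unfamiliar with what a manual gate does during rescaling, the idea can be sketched as follows. This is a conceptual illustration only, not scimap's actual code: it assumes the gate is given on a log1p scale and that a rescaled value of 0.5 marks the positive/negative boundary. The function name `rescale_with_gate` and the intensities are made up.

```python
import numpy as np

def rescale_with_gate(values, gate):
    """Map values to [0, 1] so that `gate` lands exactly on 0.5 (illustrative only)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    out = np.empty_like(values)
    below = values < gate
    # Below the gate: stretch [lo, gate) onto [0, 0.5)
    out[below] = 0.5 * (values[below] - lo) / max(gate - lo, 1e-12)
    # At or above the gate: stretch [gate, hi] onto [0.5, 1]
    out[~below] = 0.5 + 0.5 * (values[~below] - gate) / max(hi - gate, 1e-12)
    return out

# Made-up raw CD138 intensities, log1p-transformed (assumed gate scale)
log_cd138 = np.log1p(np.array([0.0, 1.5, 3.0, 10.0, 29.0]))
scaled = rescale_with_gate(log_cd138, gate=1.5)

# Under this convention a cell counts as CD138-positive when scaled >= 0.5
positive = scaled >= 0.5
print(positive)
```

Under this reading, a cell whose CD138 value is below the gate always ends up below 0.5 for CD138 itself, so a wrong label would have to come from the later phenotype scoring rather than from the gate.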

This generally worked well for most phenotypes; however, I am struggling with plasma cells. This is my matrix for gating: Screenshot 2024-06-26 at 14 29 15

I attached two screenshots to illustrate the problem. In the first, you can see me setting the gate for CD138 to 1.5. CD138 is the marker that needs to be positive for a cell to be defined as a plasma cell (see also the matrix above): Screenshot 2024-06-26 at 14 28 13

However, once I have run sm.pp.rescale and sm.tl.phenotype_cells, I inspect the result using:

  sm.pl.image_viewer (image_path, adata, overlay = 'phenotype', point_color='white', point_size=3, seg_mask = mask_path)

Here I can clearly see that scimap assigns cells as plasma cells that did not pass the gate I set before: Screenshot 2024-06-26 at 14 27 29
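The mismatch described above can also be checked numerically rather than by eye. The following is a minimal sketch on a toy table. It assumes the CD138 gate of 1.5 is on a log1p scale. The column names and values are made up, standing in for the CD138 column of adata.raw.X and for adata.obs['phenotype'].

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the raw CD138 intensities and the assigned phenotype labels
cells = pd.DataFrame({
    "CD138_raw": [0.5, 2.0, 12.0, 29.0, 1.0],
    "phenotype": ["Plasma cells", "T cells", "Plasma cells",
                  "Plasma cells", "Plasma cells"],
})
gate = 1.5  # manual CD138 gate, assumed to be on a log1p scale

cells["CD138_log"] = np.log1p(cells["CD138_raw"])

# Cells labelled as plasma cells although their CD138 value is below the gate
is_plasma = cells["phenotype"] == "Plasma cells"
below_gate = cells["CD138_log"] < gate
suspect = cells[is_plasma & below_gate]
print(suspect)
```

A non-empty `suspect` table would confirm that the label comes from something other than the CD138 gate itself.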

What could be the problem here? I also tried using the layer='log' argument in sm.pl.gate_finder, but I doubt that this solves the issue, as the gates are then completely different.

Happy for any help!

ajitjohnson commented 2 months ago

Hi @LukasHats

  1. Can you re-run it after removing allneg from plasma cells?
  2. Can you show me what adata.raw.X looks like?
  3. Out of curiosity, do these plasma cells express CD45? If they do, is there a reason why they are not gated after the CD45 assignment?

LukasHats commented 2 months ago

@ajitjohnson Thanks for the super fast reply!

  1. This seems to have resolved the issue. See image:

Screenshot 2024-06-26 at 15 31 51

Initially, I included all these allneg entries because immune cells in close contact with the plasma cells (myeloma cells) suffered from spillover and were assigned as plasma cells (myeloma cells). I used the allneg strategy to prevent this, but it seems this was not a good idea. Do you perhaps have an idea for preventing these misassigned cells?

2.

  adata.raw.X
  array([[ 3.96480636,  2.27990406,  6.00049857, ...,  1.66016986,
           2.39910141,  0.85182912],
         [ 5.8269676 ,  5.02511351,  1.79790638, ...,  4.58909818,
           3.02833282,  1.74198974],
         [ 5.41194391,  0.95435072,  1.42363619, ...,  2.8124897 ,
          10.61775327,  1.32483804],
         ...,
         [ 4.83710968,  1.04664985, 10.29334917, ...,  1.41952243,
           2.97784023,  0.79173471],
         [11.27220984,  2.63773545, 29.35303122, ...,  2.17003537,
           2.81499633,  3.40928472],
         [ 1.23573732,  1.65347757,  2.08084869, ...,  0.6       ,
           0.        ,  0.        ]])
  3. Normally, plasma cells should be positive for CD45, but in this case these are myeloma cells, meaning tumor plasma cells, which can lose CD45 expression. Some keep it, though, which is why I assign them before the CD45 hierarchical gating.

ajitjohnson commented 2 months ago

What seems to have happened is that your allneg scoring overshadowed CD138. This is because allneg and allpos have a slightly higher priority than neg or pos.

  1. You could try using neg on a few markers instead of on all the markers you have now.
  2. You could always add a subsequent step for the classification of plasma cells -> myeloma cells if you believe myeloma cells are being misclassified as plasma cells.
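The first suggestion above could look roughly like the following phenotype table. This is a hedged sketch: the rows, the choice of which markers get neg, and the column names "group" and "phenotype" are hypothetical placeholders, and the layout (two leading label columns followed by one column per marker) is assumed from scimap's example workflow file rather than taken from this thread.

```python
import pandas as pd

# Hypothetical workflow: CD138 must be positive for plasma cells, and only a
# few spillover-prone markers are set to neg instead of marking every other
# marker as allneg.
phenotype = pd.DataFrame(
    [
        ["all", "Plasma cells", "pos", "neg", "neg", None],
        ["all", "T cells", None, "pos", None, None],
        ["all", "Macrophages", None, None, "pos", None],
        ["all", "Neutrophils", None, None, None, "pos"],
    ],
    columns=["group", "phenotype", "CD138", "CD3", "CD68", "MPO"],
)

# Collect the rule keywords actually used in the marker columns
markers = phenotype.columns[2:]
used = {v for v in phenotype[markers].to_numpy().ravel() if pd.notna(v)}
print(used)
```

Keeping the plasma cell row to pos and neg only means no allneg or allpos rule can take precedence over the CD138 requirement.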
LukasHats commented 2 months ago

  1. Okay, that is good information on how these labels behave. Thanks, I will try that out!

  2. So in these samples all plasma cells should actually be myeloma cells. I am more worried about infiltrating immune cells being misclassified as myeloma/plasma cells.

But thanks a lot for the fast and successful help. I will try out your suggestions and let you know how that works before closing the issue.

ajitjohnson commented 2 months ago

Ah okay, in that case reclassify plasma cells into normal immune cells.
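One possible reading of this suggestion is a post-hoc relabelling step, sketched here with plain pandas on a toy table. It is an illustration under assumptions, not scimap's own mechanism: the marker values are made-up rescaled values where 0.5 or higher is treated as positive, and the `rules` mapping and the `phenotype_refined` column are hypothetical.

```python
import pandas as pd

# Toy stand-in for adata.obs plus rescaled marker values (>= 0.5 means positive)
obs = pd.DataFrame({
    "phenotype": ["Plasma cells", "Plasma cells", "T cells", "Plasma cells"],
    "CD3": [0.82, 0.10, 0.91, 0.30],
    "CD68": [0.05, 0.20, 0.10, 0.74],
})

# Hypothetical rules: a "plasma cell" that is positive for an immune lineage
# marker is relabelled as the corresponding immune cell type.
rules = {"CD3": "T cells", "CD68": "Macrophages"}

refined = obs["phenotype"].copy()
for marker, label in rules.items():
    mask = (obs["phenotype"] == "Plasma cells") & (obs[marker] >= 0.5)
    refined.loc[mask] = label

obs["phenotype_refined"] = refined
print(obs)
```

Writing the result to a new column keeps the original assignment available for comparison in the image viewer.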

LukasHats commented 2 months ago

I checked a few more phenotyped images and the new gating scheme without the allneg seems to work quite well. Thanks for the super fast help and insights into the gating behaviour!