angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Apply Pixie to Codex dataset #1065

Open RYY0722 opened 11 months ago

RYY0722 commented 11 months ago

Hi, thanks a lot for the great work and congratulations!!

I would like to apply this wonderful method to my CODEX dataset to reduce manual work! However, I found that some cells are misclassified after the pixel and cell cluster annotation. I am not sure where to go back to and what to change in order to improve the annotations. Could you share some insights? Thanks a lot!

  1. I have a slide of 28800x66240 pixels. Could you give some suggestions for selecting regions for annotation? For example, would a few regions with interesting structure be better, or scattered small FOVs? Does the size of the FOV matter greatly?
  2. After finding that some cells are misclassified, what should I adjust to refine the results? [image] The image above shows the cells labeled "endothelial cell". However, some of them do not express CD31 (the marker we selected) at all, and some cells with high CD31 expression are not labeled as endothelial cells. Could you give me some suggestions for this situation?

Thank you very much, and I look forward to your reply!!

ngreenwald commented 10 months ago

Hey @RYY0722, I would recommend looking at the pixel cluster output to ensure that everything you included in the CD31 cluster expresses CD31, and that you didn't miss any. If the individual pixels in the CD31 cluster are not actually expressing CD31, then when you go to annotate cell clusters, those inaccuracies will propagate downstream.
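As a rough illustration of this check, the sketch below averages CD31 intensity per pixel cluster and flags two cases: a cluster labeled CD31 whose mean is low, and a cluster not labeled CD31 whose mean is high. The records, cluster names, and the 0.5 threshold are made-up assumptions for illustration; they are not the pipeline's actual output format or API.

```python
from collections import defaultdict

# Hypothetical per-pixel records: (pixel cluster label, normalized CD31 intensity).
# In practice these values would be read from your own pixel clustering output.
pixels = [
    ("CD31", 0.82), ("CD31", 0.75), ("CD31", 0.05),
    ("GFAP", 0.02), ("GFAP", 0.04),
    ("unassigned", 0.70), ("unassigned", 0.66),
]


def mean_by_cluster(records):
    """Return the mean marker intensity for each pixel cluster."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cluster, value in records:
        totals[cluster] += value
        counts[cluster] += 1
    return {cluster: totals[cluster] / counts[cluster] for cluster in totals}


def flag_clusters(means, marker_label, threshold):
    """Return (clusters labeled with the marker but low, clusters not labeled but high)."""
    low_in_label = [c for c, m in means.items() if c == marker_label and m < threshold]
    high_elsewhere = [c for c, m in means.items() if c != marker_label and m >= threshold]
    return low_in_label, high_elsewhere


means = mean_by_cluster(pixels)
# The threshold is an arbitrary assumption; choose one that suits your normalization.
suspect, missed = flag_clusters(means, "CD31", threshold=0.5)
print("mean CD31 per cluster:", means)
print("labeled CD31 but low:", suspect)
print("high CD31 but not labeled CD31:", missed)
```

In this toy data the "unassigned" cluster is flagged as a possible missed CD31 cluster. A mean can hide a minority of negative pixels, so looking at the images remains necessary.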

If you want to QC the results, the important thing is to have representative images of the important regions. If random sampling gives you that, you could sample randomly; otherwise, take a guided approach where you select specific regions.
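A minimal sketch of the random-sampling option, assuming the slide is cut into non-overlapping square FOVs. The FOV size of 2048 pixels, the number of FOVs, and the reading of 28800 as height and 66240 as width are all assumptions for illustration, not recommendations from the maintainers.

```python
import random


def tile_origins(height, width, fov_size):
    """List (row, col) top-left corners of non-overlapping square FOVs that fit inside the slide."""
    return [
        (row, col)
        for row in range(0, height - fov_size + 1, fov_size)
        for col in range(0, width - fov_size + 1, fov_size)
    ]


def sample_fovs(height, width, fov_size, n, seed=0):
    """Randomly pick up to n FOV origins; the seed makes the selection reproducible."""
    rng = random.Random(seed)
    origins = tile_origins(height, width, fov_size)
    return sorted(rng.sample(origins, min(n, len(origins))))


# Slide dimensions from the question; which one is height is an assumption.
fovs = sample_fovs(28800, 66240, fov_size=2048, n=20)
for row, col in fovs[:3]:
    print(f"FOV at row={row}, col={col}, size=2048")
```

For the guided approach, the same list of origins can be filtered by hand to keep only tiles that cover the structures of interest.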

RYY0722 commented 10 months ago

Hi @ngreenwald, thanks a lot for the suggestions!! Here is my interpretation: since we are annotating at the pixel level, the markers that define a cell type are not necessarily co-expressed in the same pixels. For example, we are working on glioma samples, and AC-like tumor cells are characterized by GFAP2+SOX2. However, GFAP2 and SOX2 do not necessarily overlap at the pixel level. Therefore, we should annotate GFAP2-positive clusters as such even if they also contain other markers, like CD44, CD20, etc.

  1. In this case, we should annotate pixel-level clusters at the protein level, for example "GFAP+", "Foxp3+", etc.
  2. In the next step, the heatmap of cell-level clusters should be cleaner, and the annotation should be easier and more convincing.
  3. Then we can use combinations of pixel-level annotations to annotate cells.

On another note, what kind of QC steps would you suggest? Could you give me some examples? Thank you very much!!

Looking forward to your reply!

ngreenwald commented 10 months ago

Whatever decisions you make at the pixel level will impact your cell clustering. If you have a pixel cluster that expresses GFAP and CD20, but you think the CD20 is artifactual/unimportant, then labeling that cluster GFAP makes sense. However, if you think the GFAP/CD44 co-expression represents real signal from both markers, and you label it GFAP only, then when you do cell clustering, you won't be able to separate GFAP+CD44- from GFAP+CD44+. Whether this is a problem depends on how you want to do cell clustering.
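The trade-off can be sketched with a toy example, assuming each cell is described by how many of its pixels fall into each annotated pixel cluster. The cells, labels, and pixel counts below are invented for illustration and are not taken from any real dataset.

```python
from collections import Counter

# Hypothetical pixel cluster assignments for the pixels inside two cells.
cell_pixels = {
    "cell_A": ["GFAP+CD44-"] * 8 + ["other"] * 2,
    "cell_B": ["GFAP+CD44+"] * 8 + ["other"] * 2,
}


def cell_profile(pixel_labels, rename):
    """Count pixels per annotated label after applying a rename mapping."""
    return Counter(rename.get(label, label) for label in pixel_labels)


# Two annotation choices: keep the clusters separate, or label both as plain "GFAP".
annotation_schemes = {
    "separate": {},
    "collapsed": {"GFAP+CD44-": "GFAP", "GFAP+CD44+": "GFAP"},
}

results = {}
for name, mapping in annotation_schemes.items():
    profiles = {cell: cell_profile(labels, mapping) for cell, labels in cell_pixels.items()}
    # The two cells can only be separated downstream if their profiles differ.
    results[name] = profiles["cell_A"] != profiles["cell_B"]
    print(f"{name}: cells distinguishable = {results[name]}")
```

With separate labels the two cells have different profiles; once both clusters are renamed to "GFAP" their profiles are identical, so no downstream clustering on these counts can tell them apart.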

For QC, we use Mantis viewer currently. The notebook/README explains how we use it.

RYY0722 commented 10 months ago

Noted with thanks!