broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

Question about sample annotation file for organoids #570

Open Silvia-Bio opened 1 year ago

Silvia-Bio commented 1 year ago

Hello,

I'm seeking clarification on how to run inferCNV with my samples. I have a count matrix obtained by performing Smart-seq2 on three different organoids. One is an untreated control; the other two were derived from the control and subjected to different drug treatments. My goal is to use inferCNV to identify CNV changes in the treated organoids relative to the untreated control. I have annotated the cell types present in my samples, but cell type is not the primary focus of my analysis, so I'm unsure how to populate the second column of the annotation file.

If I populate the second column of the annotation file with the treatment information (e.g., control, treatment_1, treatment_2) instead of the cell type name, would it be appropriate to use the control (known to be triploid) as the "normal" reference? Or do you have any other recommendations or advice?
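For concreteness, here is a minimal sketch of the annotation file described above, assuming the usual inferCNV layout: tab-delimited, no header, cell name in the first column and group label in the second. The cell names and the helper `format_annotations` are hypothetical and only illustrate the idea of using treatment labels instead of cell types.

```python
import csv
import io


def format_annotations(cell_to_condition):
    """Return the annotation file contents as text.

    One row per cell: cell name, then a tab, then the group label.
    No header row is written.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for cell, condition in cell_to_condition.items():
        writer.writerow([cell, condition])
    return buffer.getvalue()


# Hypothetical cell names; real ones must match the count matrix columns.
cells = {
    "ctrl_cell_001": "control",
    "ctrl_cell_002": "control",
    "t1_cell_001": "treatment_1",
    "t2_cell_001": "treatment_2",
}

annotation_text = format_annotations(cells)
print(annotation_text, end="")
```

The text can then be saved to a file and passed to inferCNV, with `control` given as the reference group (in the R package this is the `ref_group_names` argument of `CreateInfercnvObject`).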

I have a second question. Is it possible to add more than one annotation bar to the left of the heatmap, for example, one with the cell cycle phases?

I would greatly appreciate any assistance you can provide.

Thank you!

Silvia

gloriafight commented 10 months ago

Hi, were you able to solve this issue? I also want to know how to distinguish between cancer cells and normal cells in organoids, when I don't have organoids with normal cells as the control.

Silvia-Bio commented 10 months ago

Hi,

I used the untreated control as my "normal" reference. One thing to note with this approach: after the run, the inferCNV output folder contained a file named "HMM_CNV_predictions.HMMi6.rand_trees.hmm_mode-subclusters.Pnorm_0.5.pred_cnv_regions.dat". This file lists the large-scale genomic regions identified within the subclones. However, it only included CNV predictions for the subclones detected in the treatment conditions, not for the reference subclones. I think this happens because the analysis assumes that the reference subclones are diploid.

So I interpreted the results as follows: any genomic alteration shared between the control and treated conditions was not reported as a CNV, because it was considered normal. Conversely, any CNV change found exclusively in the treated samples was reported as a CNV event.
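A rough sketch of how the observation above could be checked: group the predicted regions by subclone and list the conditions that have no calls at all. The column names (`cell_group_name`, `cnv_name`, `state`, `chr`, `start`, `end`) are an assumption about the layout of the `pred_cnv_regions.dat` file, and the rows in `EXAMPLE` are made up for illustration; they are not real results.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the ASSUMED column layout of pred_cnv_regions.dat.
EXAMPLE = (
    "cell_group_name\tcnv_name\tstate\tchr\tstart\tend\n"
    "treatment_1.treatment_1_s1\tchr7-region_1\t4\tchr7\t1000000\t50000000\n"
    "treatment_1.treatment_1_s2\tchr7-region_2\t4\tchr7\t1000000\t48000000\n"
    "treatment_2.treatment_2_s1\tchr10-region_3\t2\tchr10\t2000000\t30000000\n"
)


def regions_by_group(handle):
    """Map each subclone name to its list of (chr, start, end, state)."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        region = (row["chr"], int(row["start"]), int(row["end"]), int(row["state"]))
        grouped[row["cell_group_name"]].append(region)
    return dict(grouped)


def conditions_without_calls(grouped, conditions):
    """Return conditions for which no subclone name starts with that label."""
    return [
        condition
        for condition in conditions
        if not any(name.startswith(condition) for name in grouped)
    ]


grouped = regions_by_group(io.StringIO(EXAMPLE))
missing = conditions_without_calls(
    grouped, ["control", "treatment_1", "treatment_2"]
)
print(missing)
```

With a real file, `io.StringIO(EXAMPLE)` would be replaced by an open file handle. If the reference group is the only one in `missing`, that matches the behaviour described here.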

While this approach is not ideal, it served my purposes reasonably well.

KR,

S.