broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

choice of reference for bone marrow #488

Open marie2T opened 1 year ago

marie2T commented 1 year ago

Dear Professor, Many thanks for this nice tool.

I am using infercnv to identify large, known CNAs in bone marrow samples at the single-cell level. However, selecting the reference has been tricky: using "normal" bone marrow as the reference did not recover the known results, whereas a set of PBMCs did. I was therefore wondering whether you already have feedback on this specific question, and whether I should analyse the whole bone marrow sample or only selected clusters (monocytes or lymphoid precursors) from my sample with infercnv.

Many thanks for your answer.

GeorgescuC commented 1 year ago

Hi @marie2T,

I do not have specific experience with bone marrow samples, but I can help with the general way of running things. During the initial filtering of genes, infercnv computes the average expression of each gene across all the cells provided and removes every gene below the cutoff value. Because of this, it is best to have matched cell types between your reference and observation cells, so that the expressed genes are the same (apart from genes lost through deletion CNVs in the tumor).

If you have different cell types, you may run into two issues. First, some of the lower-expression genes get filtered out, which removes real signal and increases the average distance between the remaining genes, and that degrades the results. Second, you can end up keeping genes that are not expressed in one of the cell types because they are highly expressed in the other, and those genes dilute the signal when the data are smoothed along the chromosomes. The size of these effects depends on the relative proportions of each cell type. This is why we recommend not mixing too many different cell types in the same analysis, and matching the cell types specifically rather than only the organ of origin. You can annotate cell types using marker genes outside of infercnv, and then run analyses in which each cell type is matched with normal cells of the same type.
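The filtering effect described above can be illustrated with a small toy calculation. This is only a sketch under stated assumptions, not infercnv's actual code: the gene names, expression values, cell counts, and the cutoff of 0.1 are all made up for illustration, and the filter is reduced to "keep a gene if its mean across all cells is at or above the cutoff".

```python
from statistics import mean

# Hypothetical per-cell expression of three genes in two cell types (toy numbers).
PROFILES = {
    "type_A": {"GENE_SHARED": 1.0, "GENE_LOW_A": 0.15, "GENE_HIGH_B": 0.0},
    "type_B": {"GENE_SHARED": 1.0, "GENE_LOW_A": 0.0, "GENE_HIGH_B": 3.0},
}


def build_matrix(cell_types):
    """Return gene -> list of per-cell values, one value per entry in cell_types."""
    genes = PROFILES["type_A"].keys()
    return {g: [PROFILES[t][g] for t in cell_types] for g in genes}


def genes_passing_cutoff(matrix, cutoff):
    """Keep genes whose mean expression across ALL cells is at or above the cutoff."""
    return sorted(g for g, values in matrix.items() if mean(values) >= cutoff)


CUTOFF = 0.1  # assumed value, for illustration only
observations = ["type_A"] * 10  # observation cells are all type A

# Reference of the same cell type as the observations.
matched = genes_passing_cutoff(build_matrix(observations + ["type_A"] * 10), CUTOFF)
# Reference of a different cell type.
mismatched = genes_passing_cutoff(build_matrix(observations + ["type_B"] * 10), CUTOFF)

print("matched reference:   ", matched)
print("mismatched reference:", mismatched)
```

In this toy setup, the matched reference keeps the low-expression gene that the observation cells really express, while the mismatched reference drops it (its mean is halved by the cells that do not express it) and instead keeps a gene that is silent in the observation cells.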

Regards, Christophe.