spcdot / CopyscAT

Inferring CNV data from single cell ATAC seq
GNU General Public License v3.0

Interpretation of CNV scores and adjustments #4

Open Brawni opened 3 years ago

Brawni commented 3 years ago

Hi!

Thanks for sharing this package! I ran the pipeline from the tutorial on my own data and visualised the CNV scores on UMAPs. In most cases the results look very similar to those from a separate scRNA-based CNV analysis, which is good! However, there are a couple of examples I'm not sure how to interpret.

In the first figure, the CNV score varies in a way that makes sense given the scRNA CNV analysis: the variation comes from a subset of the cancer population (top left). However, the blue cluster, which is the one that should carry the CNV, has a score nearer to 2. Is it possible that the range is off in some cases, as it seems to be here? I tried tweaking the sdCNV parameter but nothing changed.

The second example shows the cancer cluster with a deletion in chr22 (also reported by scRNA). However, many other (normal) cells outside this cluster show the same CNV, and these are presumably false positives. Is there a way to adjust for this? Here too, the UMAP was identical after I changed the sdCNV parameter several times.

[Two images: UMAPs of CNV scores]

Thanks a lot!

spcdot commented 3 years ago

Hi Bruno,

I have encountered the same thing in some of the datasets I have analysed. In my data it is most often chromosome 19: partly because of its high gene density, the normalisation method I use tends to overcorrect the signal. If you look at the cnv_summary graph, I bet the baseline on these chromosome arms will be either far below or (less commonly) above the computed expected signal value. It tends to vary by sample type (the blood cancers I studied seem less affected than the brain tumours, for some reason). Because the baseline for these chromosomes is off, you can get false positives in some cases, or an incorrect number, as you noticed, in others. I'm still thinking about how best to address this; an option to manually adjust the corrected signal on certain chromosome arms may do the trick. I'm a bit busy with other things at the moment, but I may play around with that in the next update.

Cheers,

Ana
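The baseline check and manual per-arm adjustment Ana describes above could be sketched roughly as follows. This is plain Python and not part of CopyscAT; the function names, the toy scores, and the expected value of 2 are all assumptions for illustration. It also assumes that most cells are copy-neutral on the arm being recentred, which will not hold for every sample.

```python
from statistics import median

def arm_baseline_offsets(scores, expected=2.0):
    """Return, per chromosome arm, how far the median score sits from `expected`.

    `scores` maps arm name -> list of per-cell CNV scores (hypothetical layout).
    """
    return {arm: median(vals) - expected for arm, vals in scores.items()}

def recentre_arms(scores, arms, expected=2.0):
    """Shift only the listed arms so that their median equals `expected`.

    Assumes most cells are copy-neutral on those arms; other arms are copied unchanged.
    """
    offsets = arm_baseline_offsets(scores, expected)
    return {
        arm: [v - offsets[arm] for v in vals] if arm in arms else list(vals)
        for arm, vals in scores.items()
    }

# Toy data (made up): chr19p sits about 0.5 below the expected baseline.
scores = {
    "chr19p": [1.4, 1.5, 1.6, 1.5, 2.5],
    "chr1p": [2.0, 2.1, 1.9, 2.0, 2.0],
}
offsets = arm_baseline_offsets(scores)
adjusted = recentre_arms(scores, arms={"chr19p"})
```

Inspecting `offsets` first shows which arms drift from the expected baseline, so that only those arms are shifted.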

Brawni commented 3 years ago

Hi Ana,

Thanks for your reply. Here is the cnv_summary violin plot. Chr 19 does seem a bit low, as you mention, although chr 22 and chr 21 are actually the lowest. Looking forward to the next update!

Thank you!

[Image: cnv_summary violin plot]

yifnzhao commented 2 years ago

Hi, I am wondering whether there has been any update regarding the chr19 gene density issue. In my brain dataset I am seeing abnormally high CNV scores (>5) for most cells on chr19p. Should I interpret CNV scores as the inferred copy number state?

spcdot commented 2 years ago

Hi Yifan,

I would try using the alternate method in the tutorial (starting with the identifyNonNeoplastic subroutine). Chromosome 19 has a ton of genes, and I have found that, depending on normalization, it routinely comes out too high or too low; this also appears to be somewhat dataset-dependent. In the second part, where it has "nmf_results$normalBarcodes", if you have a population you suspect is "normal" based on other markers or the absence of other CNVs, you can run that part of the tutorial and finish by calling annotateCNV4B. Alternatively, you can use the results of identifyNonNeoplastic, but I would first check those visually in something like Signac to make sure they make sense. Either way, the baseline is then set from that "normal" population instead of the average across all cells. Calling the cells this way may give you more easily interpretable results. I find chromosome 19 especially tricky, and recommend interpreting the data in the context of other chromosomal alterations.
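To illustrate the idea of setting the baseline from a "normal" population instead of from all cells, here is a minimal sketch in plain Python. It does not use or reproduce CopyscAT's identifyNonNeoplastic or annotateCNV4B; the function name, the data layout, the barcodes, and the ratio-based scaling to an expected value of 2 are all assumptions for illustration.

```python
from statistics import median

def rebaseline(scores_by_cell, normal_barcodes, expected=2.0):
    """Rescale per-arm scores so the normal population's median maps to `expected`.

    `scores_by_cell` maps barcode -> {arm: score} (hypothetical layout).
    Ratio scaling is an assumption made here, not the package's method.
    """
    normal = [scores_by_cell[b] for b in normal_barcodes if b in scores_by_cell]
    if not normal:
        raise ValueError("no normal barcodes found in the score table")
    arms = list(normal[0])
    # Per-arm baseline taken from the normal cells only.
    baseline = {arm: median(cell[arm] for cell in normal) for arm in arms}
    if any(b <= 0 for b in baseline.values()):
        raise ValueError("baseline must be positive for ratio scaling")
    return {
        barcode: {arm: expected * cell[arm] / baseline[arm] for arm in arms}
        for barcode, cell in scores_by_cell.items()
    }

# Toy data (made up): chr19p is inflated in every cell, while chr22q is
# lower than the normal cells only in the suspected tumour cell "t1".
scores_by_cell = {
    "n1": {"chr19p": 5.0, "chr22q": 2.0},
    "n2": {"chr19p": 5.2, "chr22q": 2.0},
    "n3": {"chr19p": 4.8, "chr22q": 2.0},
    "t1": {"chr19p": 5.0, "chr22q": 1.0},
}
rebased = rebaseline(scores_by_cell, normal_barcodes=["n1", "n2", "n3"])
```

In this toy case the inflated chr19p score in `t1` drops back to the expected value, because the normal cells are equally inflated, while the chr22q loss remains visible.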

yifnzhao commented 2 years ago

Thanks for your prompt reply! I will try to identify a "normal" population first.

Best, Yifan