djb17 closed 3 years ago
Hi, we ran PureCN on OVC here: https://ascopubs.org/doi/suppl/10.1200/CCI.19.00130
I think there is a Supplemental Table with our purities and ploidies. Does it look very different from what you get?
Hi Markus. Quick as usual! I still have a handful of samples running, but these numbers seem very close to what I'm getting.
From what I skimmed in the paper that I mentioned, they employ 4 different approaches and normalize the purities to come up with a consensus value. I guess I'll have to look closely at their methods, but it seems odd that our estimate is so far off from what they reported.
Table S1 lists the values from ABSOLUTE and FACETS as well. You can also check for the TP53 somatic mutation that all of them have. 40% purity for OVC should be on the lower tail of all samples, so they might be right. Note that ABSOLUTE is from SNP6 and those tissue slides could have a slightly different purity.
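The TP53 check above can be sketched with the standard relationship between purity, copy number and expected allele frequency. This is a minimal illustration and not PureCN's actual model: it assumes a clonal somatic mutation, a diploid normal genome and no mapping or capture bias, and the function name `expected_vaf` and its parameters are made up for this example.

```python
def expected_vaf(purity, tumor_copies, mutated_copies, normal_copies=2):
    """Expected allele frequency of a clonal somatic mutation.

    purity         -- fraction of tumor cells in the sample (0..1)
    tumor_copies   -- total copy number at the locus in tumor cells
    mutated_copies -- number of those copies carrying the mutation
    normal_copies  -- copy number in normal cells (assumed diploid)
    """
    mutant_alleles = purity * mutated_copies
    total_alleles = purity * tumor_copies + (1.0 - purity) * normal_copies
    return mutant_alleles / total_alleles


# Two common TP53 states in which the wild-type allele is lost.
scenarios = [
    ("one copy, mutated", 1, 1),
    ("copy-neutral LOH", 2, 2),
]

for purity in (0.4, 0.9):
    for label, copies, mutated in scenarios:
        vaf = expected_vaf(purity, copies, mutated)
        print(f"purity={purity:.1f}  {label:<18} expected VAF={vaf:.2f}")
```

Under these assumptions, a TP53 mutation with loss of the wild-type allele would sit at roughly 0.25 to 0.40 in a 40% pure sample and at roughly 0.82 to 0.90 in a 90% pure sample, so the observed allele frequency should make it fairly clear which purity range is plausible.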
If you post a screenshot of the B-allele frequency plot and the output of the log file, I can easily double check.
To your initial question: use as many normal samples (i.e. samples without somatic copy number alterations) as you can get. By tissue normal, I assume you mean the matched normal? I don't think TCGA provides a lot of adjacent normals; most should be blood. Blood is fine and usually of better quality. It might miss some tissue/FFPE-specific noise, but there is not much you can do about that.
Providing the matched normal via --normal should almost always produce way worse results than using the normal database.
Below is the figure from the paper I mentioned, in case you're interested. They reported a mean consensus purity for OVC of around 90%, which is why I brought this matter up.
Yes, this is expected, and PureCN should give you high values for most samples. Have a look at Table S1 in our paper. Here is the corresponding figure.
I might have misunderstood. I thought you ran PureCN on TCGA samples and got around 40%. Do you mean your own cohort is much lower? It’s usually obvious in the B-allele frequency plot whether the maximum likelihood solution is correct. Feel free to post one where you are unsure. Also, maybe check the TP53 allele frequency.
Hello again,
I was testing PureCN on TCGA samples and noticed a glaring difference from the tumor purity previously reported in this paper (figures 1, 2; supp. figures 2, 3). In short, the ovarian cohort I examined with PureCN came out at around 40% purity, whereas the previously reported consensus was around 80-90%.
To provide a little bit of detail, I used matching blood normal samples to generate coverages, making sure kits are consistent. I ran the recommended production pipeline (without providing normal coverage).
Now I'm wondering if I should be generating the coverages using the tissue normal to see if changing the normal reference significantly affects tumor purity estimation.
Thank you.