lima1 / PureCN

Copy number calling and variant classification using targeted short read sequencing
https://bioconductor.org/packages/devel/bioc/html/PureCN.html
Artistic License 2.0

PureCN with ultra-deep sequenced plasma gene-panels #323

RomainBosselut closed 11 months ago

RomainBosselut commented 1 year ago

Hello,

I work with ultra-deep sequencing data from plasma on a small panel (about 200 kb), and I am looking for a tool that allows me to estimate the tumor fraction of my plasma samples. Do you have any advice on the parameters to use with PureCN for this type of data to obtain good results for purity and ploidy?

Thanks in advance, Romain

lima1 commented 1 year ago

Hi Romain, if it's hybrid capture and you have off-target reads, there is a good chance PureCN will find a reasonable solution, although you will probably need to do manual curation.

For our own plasma panel (2.5 Mb at 3000X) we use PureCN for the samples with > 10% purity and then have XGBoost models based on a couple of features, mostly extracted from PureCN (like log-ratio autocorrelation, read pile-ups at known amplification sites, TMB, 90th percentile of somatic mutation allelic fraction, SNP unbalancedness, COSMIC hits and CMC tiers, ...). At some point I'll clean up and put it out here. Sorry for not having a better answer.
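
To make two of the features named above more concrete, here is a minimal Python sketch of how a log-ratio autocorrelation and a 90th percentile of somatic allelic fractions could be computed. This is not the code used for the models mentioned above: the function names, the choice of lag 1, and the input layout (a list of per-interval log-ratios ordered by genomic position, and a list of allelic fractions for variants already classified as somatic) are all assumptions made for illustration.

```python
from statistics import mean, quantiles


def lag1_autocorrelation(log_ratios):
    """Lag-1 autocorrelation of log-ratios ordered by genomic position.

    Hypothetical feature definition: real copy number segments make
    neighbouring intervals similar, which pushes the value towards 1,
    while pure noise gives values near 0.
    """
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two log-ratios")
    mu = mean(log_ratios)
    denom = sum((x - mu) ** 2 for x in log_ratios)
    if denom == 0:
        return 0.0  # constant signal: treated as uninformative here (assumption)
    num = sum(
        (log_ratios[i] - mu) * (log_ratios[i + 1] - mu) for i in range(n - 1)
    )
    return num / denom


def percentile_90(allelic_fractions):
    """90th percentile of allelic fractions, with linear interpolation."""
    # quantiles(n=10) returns the 9 decile cut points; the last is the 90th percentile.
    return quantiles(allelic_fractions, n=10, method="inclusive")[-1]


def plasma_features(log_ratios, somatic_afs):
    """Collect the two illustrative features into one dict (hypothetical layout)."""
    return {
        "log_ratio_autocorrelation": lag1_autocorrelation(log_ratios),
        "somatic_af_p90": percentile_90(somatic_afs),
    }
```

A dict like the one returned by `plasma_features` could serve as one row of a feature table for a gradient-boosted classifier. The remaining features in the list (pile-ups at known amplification sites, TMB, COSMIC hits, and so on) would need their own definitions.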