lima1 / PureCN

Copy number calling and variant classification using targeted short read sequencing
https://bioconductor.org/packages/devel/bioc/html/PureCN.html
Artistic License 2.0

pureCN output genes.csv vs loh.csv #318

Closed hfl112 closed 1 year ago

hfl112 commented 1 year ago

Hi developers, I've been trying to get the major CN and minor CN from the output, but both the genes output and the LOH output have "C" and "M" columns, so which file should I use? I also saw a comment online saying that the "segment of LOH is more robust". Is this correct? Any comments would be appreciated. Thanks, Funan

lima1 commented 1 year ago

C is the total copy number, M is the minor allele-specific copy number. Minor + Major = Total.
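
As a minimal sketch of that arithmetic (not PureCN code), the major copy number can be derived per row as C - M. The rows below are made-up values, not real PureCN output; only the column names `C` and `M` come from this thread, and `add_major_cn` is a hypothetical helper.

```python
import csv
import io

# Made-up rows for illustration; only the "C" and "M" column names
# come from the thread, everything else is an assumption.
EXAMPLE_CSV = """gene,C,M
GENE_A,2,1
GENE_B,3,0
GENE_C,4,NA
"""


def add_major_cn(rows):
    """Return copies of the rows with a 'major' field computed as C - M."""
    out = []
    for row in rows:
        new_row = dict(row)
        try:
            total = int(row["C"])
            minor = int(row["M"])
        except (KeyError, ValueError):
            # Missing or non-integer values (e.g. "NA"): leave major undefined.
            new_row["major"] = None
        else:
            new_row["major"] = total - minor
        out.append(new_row)
    return out


rows = add_major_cn(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
for row in rows:
    print(row["gene"], row["C"], row["M"], row["major"])
```

Rows where `M` could not be parsed keep `major` as `None` rather than guessing a value.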

I'm not sure what the robustness refers to. In general, the larger the segment (in number of SNPs), the more robust the LOH call, so it's important to make the segmentation as clean as possible. That means reducing noise by using the best pool of normals that is feasible, optimizing off-target reads, using the bait locations rather than exon locations, and balancing the sensitivity and specificity of the segmentation (segmentation that is too aggressive results in oversegmentation and thus short segments).
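
To illustrate the point about segment size, here is a small sketch that separates segments by how many SNPs support them. The `num_snps` field, the threshold of 20, and the example segments are all assumptions made for illustration, not PureCN defaults or real output; check the column names in your own loh.csv before adapting it.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    C: int         # total copy number
    M: int         # minor copy number
    num_snps: int  # hypothetical field: SNPs supporting the segment


MIN_SNPS = 20  # arbitrary illustrative threshold, not a PureCN default


def split_by_support(segments, min_snps=MIN_SNPS):
    """Split segments into (well supported, weakly supported) by SNP count."""
    supported = [s for s in segments if s.num_snps >= min_snps]
    weak = [s for s in segments if s.num_snps < min_snps]
    return supported, weak


# Made-up segments, not real data.
segments = [
    Segment("chr1", 1_000_000, 50_000_000, C=2, M=0, num_snps=140),
    Segment("chr2", 5_000_000, 5_400_000, C=3, M=0, num_snps=4),
    Segment("chr3", 10_000_000, 90_000_000, C=4, M=1, num_snps=20),
]

supported, weak = split_by_support(segments)
print("well supported:", [s.chrom for s in supported])
print("weakly supported:", [s.chrom for s in weak])
```

Short segments with few SNPs end up in the weakly supported group, which is where oversegmentation tends to show up.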

Hope that helps and answers your question.