broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

Suspected error resulting in underestimated tumour fraction #135

Closed by abcoxyzide 2 months ago

abcoxyzide commented 2 months ago

The following code shows that the logR used in ichorCNA::HMMsegment is converted to the natural logarithm (base e):

https://github.com/broadinstitute/ichorCNA/blob/5bfc03ed854f0e93fe5b624c97c1290fa0053837/R/segmentation.R#L21C2-L27C48

This is further supported by the fact that most of my male samples have a chrXMedian of around -0.7, which is close to ln(0.5) ≈ -0.69. If base 2 were used, the value for a haploid region would be around -1.
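To make the base argument concrete, here is a minimal Python sketch of the expected log ratio for a region with one copy against a two-copy reference. The function name `haploid_log_ratio` is hypothetical and not part of ichorCNA. It assumes a pure single-copy region with no tumour admixture.

```python
import math

def haploid_log_ratio(base: float) -> float:
    """Expected log ratio of a 1-copy region against a 2-copy reference.

    Hypothetical helper for illustration only; assumes no tumour
    admixture, so the copy ratio is exactly 1/2.
    """
    copy_ratio = 1 / 2
    return math.log(copy_ratio, base)

# Base 2 gives -1, while the natural log gives about -0.693.
print(haploid_log_ratio(2))
print(haploid_log_ratio(math.e))
```

A chrXMedian near -0.7 in male samples therefore looks like a natural-log value under these assumptions.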

I became suspicious when running ichorCNA on a sample known to have high tumour fraction (TF), for which the ichorCNA estimated TF was lower than expected.

I recalculated the TF from scratch from the reported logR values, treating them as natural logarithms (see https://www.pnas.org/doi/full/10.1073/pnas.1009843107#eq1), and obtained a TF that is higher than the ichorCNA output and close to what I expect.

On the other hand, treating the reported logR values as base 2 gave a TF similar to the ichorCNA output. This makes me worry that ichorCNA has an error somewhere in the conversion between logR and the n parameter.
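The effect of the base on a recalculated TF can be sketched in Python as follows. This is a simplified illustration, not ichorCNA's model: it assumes a clonal event with a known tumour copy number, a diploid normal genome, and an average tumour ploidy of 2, so that the copy ratio is (2(1 - TF) + TF * c) / 2. The function name `tumour_fraction_from_logr` and the example logR of -0.5 are hypothetical.

```python
import math

def tumour_fraction_from_logr(logr: float, base: float, tumour_copies: int = 1) -> float:
    """Solve ratio = (2 * (1 - tf) + tf * c) / 2 for tf.

    Hypothetical helper. Assumes a clonal segment with c tumour copies,
    2 normal copies, and an average tumour ploidy of 2.
    """
    if tumour_copies == 2:
        raise ValueError("a copy-neutral segment carries no information about tf")
    ratio = base ** logr  # undo the logarithm in the assumed base
    return 2 * (1 - ratio) / (2 - tumour_copies)

logr = -0.5  # illustrative value for a single-copy loss
tf_base2 = tumour_fraction_from_logr(logr, base=2)       # about 0.586
tf_base_e = tumour_fraction_from_logr(logr, base=math.e)  # about 0.787
print(tf_base2, tf_base_e)
```

Under these assumptions the same negative logR yields a higher TF when read as a natural log than when read as base 2, so checking which base the reported values actually use is the first thing to verify.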

It would be great if someone could check or confirm my suspicion. Thank you.

abcoxyzide commented 2 months ago

I overlooked that the reported logR values are in fact in base 2.

https://github.com/abcoxyzide/ichorCNA/blob/95972524b99be3bdd8e4be44a70b2c3b55d67ea3/R/segmentation.R#L104C3-L106C21