etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Choosing a segmentation algorithm #749

Open StrawHattM opened 1 year ago

StrawHattM commented 1 year ago

Hello, I was wondering if I could find some help here on how to proceed with my analysis.

I have run CNVkit on WGS data from 5 tumors, following the recommendations for WGS, against a reference built from a healthy tissue sample. Down the line, I intend to compare tumors that belong to two different conditions and identify phenotype-specific gene-level CNAs.

When it comes to segmenting, my problem is not knowing which algorithm I should choose as the basis for the rest of the analysis. I have tried cbs, hmm and hmm-tumor; judging from the description in the docs, the latter should suit my case best.
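For context, the three segmentations can be produced by running `cnvkit.py segment` once per method on the same .cnr file. Below is a minimal sketch that only builds the command lines and does not run them. The file names, the output directory and the `segment_commands` helper are hypothetical; the method names are passed through the `-m` option.

```python
from pathlib import Path

# Segmentation methods compared in this post, as passed to `-m`.
METHODS = ["cbs", "hmm", "hmm-tumor"]


def segment_commands(cnr_path, out_dir="segments"):
    """Build one `cnvkit.py segment` command per method (hypothetical helper)."""
    cnr = Path(cnr_path)
    commands = []
    for method in METHODS:
        # One output .cns per method so the results can be compared later.
        out = Path(out_dir) / f"{cnr.stem}.{method}.cns"
        commands.append(
            ["cnvkit.py", "segment", str(cnr), "-m", method, "-o", str(out)]
        )
    return commands


# "tumor1.cnr" is a placeholder file name, not one of the actual samples.
for cmd in segment_commands("tumor1.cnr"):
    print(" ".join(cmd))
```

Each command list could then be handed to `subprocess.run` once the paths point at real files.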

In the following graph I have plotted the distribution of log2 values of the segments from the three algorithms, together with that of the .cnr files I fed to the segment call. image

These are very aneuploid tumors, so that is expected. From this comparison I think CBS is the one that reflects the .cnr distribution best, but I am wondering whether that is desirable or whether it just indicates that CBS picks up more noise than the others.

Here's the comparison of the number of segments that are detected by each method in each sample: image
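A comparison like this can be tabulated directly from the .cns files, which are tab-separated with `log2` and `probes` columns among others. The sketch below is a minimal illustration using only the standard library; the `summarize_cns` helper and the toy segment values are made up for the example and are not taken from my samples.

```python
import csv
import io
import statistics


def summarize_cns(handle):
    """Summarize a .cns table: segment count and log2 statistics (hypothetical helper)."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    log2 = [float(r["log2"]) for r in rows]
    probes = [int(r["probes"]) for r in rows]
    # Weight each segment by its bin count so long segments dominate,
    # which is closer to what a bin-level (.cnr) distribution shows.
    weighted_mean = sum(l * p for l, p in zip(log2, probes)) / sum(probes)
    return {
        "n_segments": len(rows),
        "median_log2": statistics.median(log2),
        "probe_weighted_mean_log2": weighted_mean,
    }


# Toy data with made-up values, laid out like a .cns file.
TOY_CNS = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tprobes\tweight\n"
    "chr1\t0\t1000\t-\t0.5\t30\t10\t10\n"
    "chr1\t1000\t5000\t-\t-1.0\t10\t30\t30\n"
    "chr2\t0\t4000\t-\t0.0\t20\t40\t40\n"
)

summary = summarize_cns(io.StringIO(TOY_CNS))
print(summary)
```

With real data, the same function could be called on `open(path)` for each sample and method, and the results collected into one table per sample.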

My questions are:

Thanks a lot for your help!!