dariober / cnv_facets

Somatic copy variant caller (CNV) for next generation sequencing

Segmentation values for targeted panels? #34

Closed vymao closed 3 years ago

vymao commented 3 years ago

I am trying to run FACETS on targeted panel data, but I am not sure what critical values to use for segmentation. Are there recommended values for this case, or is there somewhere I could learn more about what exactly this flag does?

  --cval CVAL CVAL, -cv CVAL CVAL
                        Critical values for segmentation in pre-processing and
                        processing. Larger values reduce segmentation. [25 150] is
                        facets default based on exome data. For whole genome consider
                        increasing to [25 400] and for targeted sequencing consider
                        reducing them. Default 25 150
dariober commented 3 years ago

I don't have a definitive answer for this. I would suggest looking into the original publication for FACETS and also searching the issues submitted to the facets repository. I think this has been asked there before.

The short answer is probably just to try different values and inspect the results to see what looks more sensible.
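A minimal sketch of such a parameter sweep, assuming `cnv_facets.R` is on the `PATH`. Only `--cval` is taken from the help text quoted above; the other flags (`-t`, `-n`, `-vcf`, `-o`), the file names, and the candidate values are assumptions for illustration and should be checked against `cnv_facets.R --help`. Pairs where the second value is lower than the first are skipped, since the tool reportedly does not accept them. The script only prints the commands so they can be reviewed before running.

```python
from itertools import product
import shlex


def cval_grid(pre_values, proc_values):
    """Return (pre, proc) pairs, skipping pairs where proc < pre."""
    return [
        (pre, proc)
        for pre, proc in product(pre_values, proc_values)
        if proc >= pre
    ]


def build_command(pre, proc, tumour_bam, normal_bam, snp_vcf, out_dir="."):
    """Build one cnv_facets.R command line as a list of arguments.

    Flags other than --cval are assumed, not taken from the thread.
    """
    prefix = f"{out_dir}/cval_{pre}_{proc}"  # hypothetical output naming
    return [
        "cnv_facets.R",
        "-t", tumour_bam,
        "-n", normal_bam,
        "-vcf", snp_vcf,
        "-o", prefix,
        "--cval", str(pre), str(proc),
    ]


if __name__ == "__main__":
    # Candidate values are arbitrary; targeted panels are expected to
    # need values at or below the exome defaults of 25 and 150.
    for pre, proc in cval_grid([10, 25], [25, 50, 100, 150]):
        cmd = build_command(pre, proc, "tumour.bam", "normal.bam", "snps.vcf.gz")
        print(shlex.join(cmd))
```

Each printed command writes to its own output prefix, so the resulting plots and segment counts can be compared side by side.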

vymao commented 3 years ago

I see. And is there a reason why the second value cannot be lower than the first value? I found an article describing their methodology (Bielski, C.M., Zehir, A., Penson, A.V. et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat Genet 50, 1189–1195 (2018). https://doi-org.ezp-prod1.hul.harvard.edu/10.1038/s41588-018-0165-1):

Estimates of tumor purity and ploidy, as well as genome-wide total, allele-specific, and integer DNA copy number, were inferred from sequencing data using the FACETS algorithm (version 0.3.9)11. We utilized a two-pass implementation whereby a low-sensitivity run (cval = 100) first determined the purity and tumor-normal log-ratio corresponding to the diploid state. Gene-level segmentation and integer copy-number calls were inferred from a subsequent run with higher sensitivity for focal events (cval = 50)

I'm trying to understand if these correspond to the preprocessing and processing values or if these are two separate runs that both correspond to processing values.

dariober commented 3 years ago

> is there a reason why the second value cannot be lower than the first value?

I'm not sure about that. For this question you may be better off asking the developers of facets, the R package.

> I'm trying to understand if these correspond to the preprocessing and processing values or if these are two separate runs that both correspond to processing values.

My understanding is that they run facets twice. Whether the cval they quote applies to the preprocessing or the processing I don't know. You may need to ask the authors for that.