dariober / cnv_facets

Somatic copy variant caller (CNV) for next generation sequencing

WGS recommendations for chromothripsis analysis #40

Open ahwanpandey opened 3 years ago

Hello,

I am trying to detect chromothripsis in some ovarian cancer data using the tool ShatterSeek (https://github.com/parklab/ShatterSeek). The tool takes SV calls and CNV segments with total copy number (TCN) for each sample and outputs stats and plots for each chromosome. One criterion it uses to call chromothripsis is the number of SVs in a candidate region together with the number of segments "oscillating" between different TCN states.
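
To make "oscillating" concrete, here is a minimal Python sketch of how I understand the idea: given the TCN of consecutive segments in a region, find the longest run of adjacent segments that alternate between exactly two states. This is my own simplification, not ShatterSeek's implementation; the function name `longest_oscillating_run` is hypothetical, and I assume that adjacent segments with equal TCN break a run.

```python
from typing import Sequence


def longest_oscillating_run(tcn: Sequence[int]) -> int:
    """Length of the longest run of adjacent segments alternating between two TCN states.

    Simplified illustration only (an assumption, not ShatterSeek's code).
    """
    if not tcn:
        return 0
    best = 1
    run = 1
    for i in range(1, len(tcn)):
        if tcn[i] == tcn[i - 1]:
            # No change of state between neighbours: the run restarts here.
            run = 1
        elif run >= 2 and tcn[i] == tcn[i - 2]:
            # Back to the state seen two segments ago: the alternation continues.
            run += 1
        else:
            # Change to a different state: previous and current segment start a new run.
            run = 2
        best = max(best, run)
    return best


# A coarse segmentation of a region versus a finer one of the same region (made-up values).
coarse = [2, 3, 2]
fine = [2, 3, 2, 3, 2, 3, 2, 3, 2]
print(longest_oscillating_run(coarse), longest_oscillating_run(fine))
```

With a coarse segmentation the run is short even when the underlying region alternates many times, which is why the segmentation resolution matters so much for this criterion.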

I have attached a plot below that compares three samples, each with a high-confidence chromothripsis region (columns), between my FACETS analysis and the analysis done by the PCAWG group (https://www.nature.com/articles/s41588-019-0576-7). My FACETS analysis doesn't show the Minor CN, but that's not really relevant here. What is relevant is that the number of "oscillating" segments in my results is far smaller than in the PCAWG analysis. Their analysis also makes more sense in conjunction with the SVs in the chromothripsis regions.

image

My main question is: how can I tweak the FACETS parameters so that the segmentation is as close as possible to the PCAWG results? You can see that their resolution is much finer and that they detect many more focal events than my FACETS results do. This is the exact same sequencing data, by the way.

I have tried the following combinations of parameters. They all give similar results, with the focal resolution still missing:

For snp-pileup

snp-pileup \
    --gzip \
    --pseudo-snps 100 \
    --min-map-quality 10 \
    --min-base-quality 10 \
    --max-depth 5000 \
    --min-read-counts 15,0 \
    $COMMON_SNPS_VCF \
    "$OUTPUT_DIR"/"$OUTFILE_NAME" \
    $NORMAL_BAM_FILE \
    $TUMOR_BAM_FILE

Various FACETS parameter runs (note: I only changed the pre-processing and processing cvals and nbhd-snp)

CVAL = Critical values for segmentation in pre-processing and processing
NBHD_SNP = If an interval of size nbhd-snp contains more than one SNP, sample a random one
DEPTH = Minimum and maximum depth in normal sample for a position to be considered

"1": {
        "CVAL": "25 500",
        "NBHD_SNP": "500",
        "DEPTH": "15 5000",
},
"2": {
        "CVAL": "25 750",
        "NBHD_SNP": "500",
        "DEPTH": "15 5000",
},
"3": {
        "CVAL": "50 1000",
        "NBHD_SNP": "500",
        "DEPTH": "15 5000",
},
"4": {
        "CVAL": "50 1000",
        "NBHD_SNP": "1000",
        "DEPTH": "15 5000",
}
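
For reference, this is roughly how such a parameter grid can be turned into one cnv_facets command per run. It is only a sketch under assumptions: I assume the script is invoked as `cnv_facets.R` and that the long options are `--pileup`, `--out`, `--cval`, `--nbhd-snp` and `--depth` (please check `cnv_facets.R --help`); the helper `build_command` and the file names are hypothetical. The commands are only printed, not executed.

```python
import shlex

# Parameter sets copied from the runs listed above.
RUNS = {
    "1": {"CVAL": "25 500", "NBHD_SNP": "500", "DEPTH": "15 5000"},
    "2": {"CVAL": "25 750", "NBHD_SNP": "500", "DEPTH": "15 5000"},
    "3": {"CVAL": "50 1000", "NBHD_SNP": "500", "DEPTH": "15 5000"},
    "4": {"CVAL": "50 1000", "NBHD_SNP": "1000", "DEPTH": "15 5000"},
}


def build_command(params: dict, pileup: str, out_prefix: str) -> list:
    """Build an argument list for one run (option names are assumptions)."""
    return [
        "cnv_facets.R",
        "--pileup", pileup,
        "--out", out_prefix,
        "--cval", *params["CVAL"].split(),      # pre-processing and processing cval
        "--nbhd-snp", params["NBHD_SNP"],
        "--depth", *params["DEPTH"].split(),    # min and max depth in the normal
    ]


for run_id, params in RUNS.items():
    # Hypothetical input and output names, for illustration only.
    cmd = build_command(params, "sample.csv.gz", f"sample.run{run_id}")
    print(shlex.join(cmd))
```

Keeping the grid in one place makes it easy to add further combinations and to see exactly which values each output corresponds to.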

The data is 30-40x Normal and 60-80x Tumor.

Here is an example of the FACETS profile plot for SAMPLE2 in the above plot with CVAL=[50, 1000], NBHD_SNP=[500], DEPTH=[15, 5000]

image

As you can see, these genomes are highly scarred already.

Just hoping to get any feedback you can for me to replicate the PCAWG analysis using FACETS!

Thanks! Ahwan