AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Update manta FILTER=='PASS' Part1 : consensus cnv file generation #1114

Closed kgaonkar6 closed 3 years ago

kgaonkar6 commented 3 years ago

Purpose/implementation Section

What scientific question is your analysis addressing?

This PR adds a FILTER == PASS requirement to the manta calls used in copy_number_consensus_call. I think this filter is necessary so that the consensus uses only the high-confidence subset of broad SVs called by manta. The FILTER field is described in the manta user guide: https://github.com/Illumina/manta/blob/master/docs/userGuide/README.md#vcf-format-fields

What was your approach?

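Added a FILTER check to the awk command that extracts manta deletions:

awk '$6~/DEL/ {{if ($5 > {params.SIZE_CUTOFF} && $11 == "PASS") {{print "chr"$2,$3,$4,$5,"NA","NA","NA",$6}}}}' {input}

The 11th column of the manta input is FILTER. Note that PASS is written in double quotes: single quotes inside the single-quoted awk program would end the shell string, and awk would then compare $11 against an unset variable named PASS.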
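To make the effect of the filter concrete, here is a minimal sketch of the same condition run outside the workflow. The doubled braces and the {params.SIZE_CUTOFF} and {input} placeholders in the command above look like workflow templating, so they are replaced here with plain awk syntax. The toy rows, the file name manta_toy, and the cutoff value of 50 are assumptions for illustration only; they are not taken from the module, and only the column positions the command reads ($2 to $6 and $11) are mirrored.

```shell
# Assumed cutoff for illustration; the real value comes from the workflow parameters.
SIZE_CUTOFF=50

# Hypothetical 11-column, tab-separated toy input.
# Columns used: $2 chrom, $3 start, $4 end, $5 length, $6 SV type, $11 FILTER.
manta_toy=$(mktemp)
{
  printf 's1\tX\t100\t5100\t5000\tDEL\t.\t.\t.\t.\tPASS\n'
  printf 's1\tX\t200\t9200\t9000\tDEL\t.\t.\t.\t.\tMinSomaticScore\n'
  printf 's1\t2\t300\t320\t20\tDEL\t.\t.\t.\t.\tPASS\n'
  printf 's1\t7\t400\t8400\t8000\tDUP\t.\t.\t.\t.\tPASS\n'
} > "$manta_toy"

# Keep deletions longer than the cutoff whose FILTER is exactly PASS.
filtered=$(awk -F'\t' -v cutoff="$SIZE_CUTOFF" \
  '$6 ~ /DEL/ && $5 > cutoff && $11 == "PASS" {print "chr"$2, $3, $4, $5, "NA", "NA", "NA", $6}' \
  "$manta_toy")

echo "$filtered"
rm -f "$manta_toy"
```

Only the first toy row survives: the second fails the FILTER check, the third is below the size cutoff, and the fourth is not a deletion.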

What GitHub issue does your pull request address?

1113

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

NA

Is there anything that you want to discuss further?

I will be updating the focal-cn-file-preparation and oncoprint with the updated consensus cnv as different PRs. Does that sound ok or should I add the updates in this PR since it's only reruns?

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

tables

What is your summary of the results?

Broad SV calls from manta that don't have FILTER == 'PASS' are removed. The values in the manta FILTER column are distributed as follows:

FILTER                                count
MaxDepth                               2663
MaxDepth;MaxMQ0Frac                     636
MaxDepth;MaxMQ0Frac;MinSomaticScore     931
MaxDepth;MinSomaticScore               7189
MaxMQ0Frac                              752
MaxMQ0Frac;MinSomaticScore             1954
MinSomaticScore                      251720
PASS                                 108129
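As a quick check on how much the filter removes, the share of PASS records can be computed directly from the counts reported above. This is only arithmetic on those numbers; the variable names filter_counts and pass_share are made up for the sketch.

```shell
# FILTER value and record count, copied from the distribution above.
filter_counts='MaxDepth 2663
MaxDepth;MaxMQ0Frac 636
MaxDepth;MaxMQ0Frac;MinSomaticScore 931
MaxDepth;MinSomaticScore 7189
MaxMQ0Frac 752
MaxMQ0Frac;MinSomaticScore 1954
MinSomaticScore 251720
PASS 108129'

# Sum all counts and report the fraction whose FILTER is exactly PASS.
pass_share=$(printf '%s\n' "$filter_counts" | awk '
  { total += $2; if ($1 == "PASS") pass = $2 }
  END { printf "%d of %d records pass (%.1f%%)", pass, total, 100 * pass / total }')

echo "$pass_share"
```

So a little under a third of the manta records are kept, and most of the rest fail on MinSomaticScore alone.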

Since this investigation started from a discrepancy in deep deletions on chrX, I'm pointing out the differences in deep deletions between the consensus cnv in this PR and the one on master.

# consensus_cnv in this updated PR
> consensus_cnv[which(consensus_cnv$copy.num=="0"),"chrom"] %>% table()
.
 chr1 chr10 chr11 chr13 chr14 chr17  chr2 chr22  chr4  chr6 
    5     4     1     1     1     8    13     4     3     1 
 chr7  chr8  chr9  chrX  chrY 
    2     2     7     3    36 

# consensus_cnv from master
> consensus_cnv_master[which(consensus_cnv_master$copy.num=="0"),"chrom"] %>% table()
.
 chr1 chr10 chr11 chr13 chr14 chr16 chr17  chr2 chr22  chr3 
    6     6     1     1     1     2     8    14     6     1 
 chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY 
    4     3     4     2     2     8    44    36 

Deep deletion counts are comparable between the two versions of the consensus cnv calls on most chromosomes; the large difference is on chrX, where the count drops from 44 on master to 3 in this PR. Most of the chrX deep deletions are removed when we keep only calls that PASS all manta filters, which suggests that the majority of the chrX deep deletions on the current master branch are low-confidence calls.
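The two tables above can be put side by side to see how much of the change is on chrX. The sketch below copies the per-chromosome counts from both tables, using 0 where a chromosome no longer appears in the PR table; the variable names deep_del and summary are made up for the example.

```shell
# Columns: chromosome, deep deletions on master, deep deletions in this PR.
deep_del='chr1 6 5
chr10 6 4
chr11 1 1
chr13 1 1
chr14 1 1
chr16 2 0
chr17 8 8
chr2 14 13
chr22 6 4
chr3 1 0
chr4 4 3
chr5 3 0
chr6 4 1
chr7 2 2
chr8 2 2
chr9 8 7
chrX 44 3
chrY 36 36'

# Sum the per-chromosome drop and pull out the chrX contribution.
summary=$(printf '%s\n' "$deep_del" | awk '
  { removed = $2 - $3; total += removed; if ($1 == "chrX") x = removed }
  END { printf "chrX accounts for %d of %d removed deep deletions", x, total }')

echo "$summary"
```

The remaining removals are spread over several autosomes, with at most 3 on any single chromosome.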

Reproducibility Checklist

Documentation Checklist

kgaonkar6 commented 3 years ago

Thanks @jharenza for pointing that out! I have added a summary explaining why we investigated this and what the results suggest. Let me know if I should add more info.