## For list2's CNV
## If any overlap exists,
## then we add in the start, end coordinate, total overlap length, and total len to different lists
## This is done to account for 1 CNV from list1 overlapping with MULTIPLE CNVs from list2
if (end - start +1) / (end_list2 - start_list2 + 1) >= 0:
What analysis module should be updated and why?
The copy_number_consensus_call module. We believe the criterion requiring CNVs to have 50% reciprocal overlap may be too stringent: since #1116, the consensus calls are missing some subtype-defining CNVs.
What changes need to be made? Please provide enough detail for another participant to make the update.
After investigation, we see the missing CNV alterations in both the controlfreec and cnvkit calls. For example, the chr19 amplification in BS_K07KNTFY is called by both controlfreec and cnvkit but is dropped from the consensus calls because the cnvkit region covers only 11% of the controlfreec region.
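To make the 11% case concrete, here is a minimal sketch of a 50% reciprocal overlap test. The function names and coordinates are hypothetical (they are not taken from the project script or from BS_K07KNTFY); the coordinates are chosen only so that the smaller call spans 11% of the larger one, and intervals are assumed to be inclusive.

```python
def overlap_fractions(start1, end1, start2, end2):
    """Return the overlap as a fraction of each interval (inclusive coordinates)."""
    overlap = min(end1, end2) - max(start1, start2) + 1
    if overlap <= 0:
        # The intervals share no bases.
        return 0.0, 0.0
    return overlap / (end1 - start1 + 1), overlap / (end2 - start2 + 1)


def is_reciprocal_overlap(start1, end1, start2, end2, threshold=0.5):
    """True only if the overlap covers at least `threshold` of BOTH intervals."""
    frac1, frac2 = overlap_fractions(start1, end1, start2, end2)
    return frac1 >= threshold and frac2 >= threshold


# Hypothetical calls: a 1,000,000 bp segment and a nested 110,000 bp segment.
large_call = (1, 1_000_000)
small_call = (200_001, 310_000)

frac_large, frac_small = overlap_fractions(*large_call, *small_call)
passes = is_reciprocal_overlap(*large_call, *small_call)
```

Under these assumptions the small call is fully covered, but the overlap is only 11% of the large call, so the pair fails a reciprocal 50% test even though both callers report the event.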
What was your approach?
Our approach was to broaden the criteria to include CNV calls from either caller that have any overlap at this step (master): https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/2511e8c9fc4a7542f5b709363f866ccddb73be8b/analyses/copy_number_consensus_call/scripts/compare_variant_calling_updated.py#L148-L152
consensus-cnv-smallCNV-overlap
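As a rough illustration of the "any overlap" criterion, the sketch below collects every call from a second caller that shares at least one base with a call from the first. The helper names and coordinates are hypothetical and the intervals are assumed to be inclusive; this is not the logic of compare_variant_calling_updated.py itself.

```python
def overlap_length(a, b):
    """Number of bases shared by two inclusive (start, end) intervals; 0 if disjoint."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def overlapping_calls(cnv, other_calls):
    """Return every call in `other_calls` that shares at least one base with `cnv`."""
    return [other for other in other_calls if overlap_length(cnv, other) > 0]


# Hypothetical data: one call from the first caller, three from the second.
list1_cnv = (100, 1_000)
list2_cnvs = [(50, 150), (400, 500), (1_001, 2_000)]

hits = overlapping_calls(list1_cnv, list2_cnvs)
```

One call from the first list can match several calls from the second, which is why the matches are returned as a list. The comparison here is strict (`> 0`), so a directly adjacent segment such as (1_001, 2_000) is not counted as overlapping; whether that distinction matters in the project script depends on surrounding code not shown here.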
In the following snippet, we also allow overlaps in which a larger CNV from caller Y completely covers a smaller CNV from caller X (master): https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/2511e8c9fc4a7542f5b709363f866ccddb73be8b/analyses/copy_number_consensus_call/scripts/compare_variant_calling_updated.py#L181
consensus-cnv-smallCNV-overlap
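The containment case can be sketched as a simple interval check. This is an illustrative example with hypothetical function names and coordinates, assuming inclusive (start, end) intervals on the same chromosome; it is not the code at the linked line.

```python
def is_contained(inner, outer):
    """True if `inner` lies entirely within `outer` (inclusive coordinates)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def either_contains(call_x, call_y):
    """True if one call completely covers the other, in either direction."""
    return is_contained(call_x, call_y) or is_contained(call_y, call_x)


# Hypothetical calls: a small CNV from caller X inside a larger CNV from caller Y.
caller_x_cnv = (200_001, 310_000)
caller_y_cnv = (1, 1_000_000)

covered = either_contains(caller_x_cnv, caller_y_cnv)
partial = either_contains((1, 500), (400, 900))
```

Checking both directions keeps the test symmetric, so it does not matter which caller produced the larger segment.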
What input data should be used? Which data were used in the version being updated?
pbta-cnv-cnvkit.seg.gz, pbta-cnv-controlfreec.tsv.gz, pbta-sv-manta.tsv.gz
When do you expect the revised analysis will be completed?
1 day
Who will complete the updated analysis?
@kgaonkar6