sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

Number of PASSed variant calls: 0 #49

Closed apastore closed 6 years ago

apastore commented 6 years ago

Hi Sigven, I have a set of VCF files for which PCGR complains that no variant has FILTER == PASS. Do you have any idea what is going on? I have tried both tumor-only and paired mode, as well as the stable and devel versions.

Thanks a lot!

python ~/pcgr/pcgr.py --input_vcf pcgc_fail_PASS.vcf.gz ~/pcgr/ pcgc_fail_PASS grch37 ./pcgr.toml pcgc_fail_PASS

[pcgc_fail_PASS.vcf.gz](https://github.com/sigven/pcgr/files/2503951/pcgc_fail_PASS.vcf.gz)
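One way to confirm whether the input itself contains PASS calls, independently of PCGR, is to tally the FILTER column directly. The following is a minimal sketch, assuming a standard tab-delimited VCF (plain or gzip/bgzip-compressed); `count_filter_values` is a hypothetical helper and not part of PCGR.

```python
import gzip
from collections import Counter


def count_filter_values(vcf_path):
    """Tally the FILTER column (7th field) of every record in a VCF file."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information lines and the column header
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 7:
                counts[fields[6]] += 1
    return counts
```

If `count_filter_values("pcgc_fail_PASS.vcf.gz")["PASS"]` is greater than zero while the report still shows zero PASSed calls, the variants were probably lost somewhere inside the pipeline rather than being absent from the input.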

2018-10-22 20:59:19 - pcgr-validate-config - WARNING - Prediction of MSI status is not perfomed in tumor-only mode (vcf_tumor_only = true)
2018-10-22 20:59:19 - pcgr-validate-config - WARNING - Estimation of mutational burden is not performed in tumor-only mode (vcf_tumor_only = true)
2018-10-22 20:59:19 - pcgr-validate-config - WARNING - Estimation of mutational signatures is not perfomed in tumor-only mode (vcf_tumor_only = true)
2018-10-22 20:59:20 - pcgr-validate-input - INFO - STEP 0: Validate input data
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Validating VCF file with EBIvariation/vcf-validator (version 0.6)
2018-10-23 00:59:22 - pcgr-validate-input - INFO - According to the VCF specification, the VCF file /workdir/input_vcf/TRF080119.vcf.gz is valid
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Checking if existing INFO tags of query VCF file coincide with PCGR INFO tags
2018-10-23 00:59:22 - pcgr-validate-input - INFO - No query VCF INFO tags coincide with PCGR INFO tags
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Found INFO tag for tumor variant sequencing depth (tumor_dp_tag DP_T) in input VCF
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Found INFO tag for tumor variant allelic fraction (tumor_af_tag AF_T) in input VCF
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Found INFO tag for normal/control variant sequencing depth (normal_dp_tag DP_N) in input VCF
2018-10-23 00:59:22 - pcgr-validate-input - INFO - Found INFO tag for normal/control allelic fraction (normal_af_tag AF_N) in input VCF
2018-10-22 20:59:22 - pcgr-validate-input - INFO - Finished
()
2018-10-22 20:59:22 - pcgr-vep - INFO - STEP 1: Basic variant annotation with Variant Effect Predictor (94, GENCODE release 19, grch37)
2018-10-22 20:59:29 - pcgr-vep - INFO - Converting input VCF to MAF with https://github.com/mskcc/vcf2maf
2018-10-22 20:59:33 - pcgr-vep - INFO - Finished
()
2018-10-22 20:59:33 - pcgr-vcfanno - INFO - STEP 2: Annotation for precision oncology with pcgr-vcfanno (ClinVar, dbNSFP, UniProtKB, cancerhotspots.org, CiVIC, CBMDB, DoCM, TCGA, ICGC-PCAWG, IntoGen_drivers)
2018-10-22 20:59:34 - pcgr-vcfanno - INFO - Finished
()
2018-10-22 20:59:34 - pcgr-summarise - INFO - STEP 3: Cancer gene annotations with pcgr-summarise
2018-10-23 00:59:36 - pcgr-gene-annotate - INFO - Completed summary of functional annotations for 0 variants on chromosome None
2018-10-23 00:59:36 - pcgr-gene-annotate - INFO - Number of non-PASS/REJECTED variant calls: 0
2018-10-23 00:59:36 - pcgr-gene-annotate - INFO - Number of PASSed variant calls: 0
2018-10-23 00:59:36 - pcgr-gene-annotate - WARNING - There are zero variants with a 'PASS' filter in the VCF file
2018-10-22 20:59:43 - pcgr-summarise - INFO - Converting VCF to TSV with https://github.com/sigven/vcf2tsv
2018-10-22 20:59:46 - pcgr-summarise - INFO - Finished
()
2018-10-22 20:59:46 - pcgr-writer - INFO - STEP 4: Generation of output files - variant interpretation report for precision oncology
2018-10-23 01:00:15 [INFO] ------
2018-10-23 01:00:15 [INFO] Assigning elements to PCGR value boxes
2018-10-23 01:00:15 [INFO] ------
2018-10-23 01:00:15 [INFO] Writing JSON file with report contents
2018-10-23 01:00:15 [INFO] ------
2018-10-23 01:00:15 [INFO] Rendering HTML report with rmarkdown
2018-10-22 21:00:21 - pcgr-writer - INFO - Finished

sigven commented 6 years ago

Hi, thanks for reporting! I ran your example, and I agree it is strange that this fails. I have now figured out the underlying cause. While the vcf-validator did not find anything wrong with the query VCF (I will have to look closer at this), a tool that PCGR uses downstream (vcfanno) complained about a particular line among your VCF header lines:

vcfanno.go:115: found 26 sources from 12 files
vcfanno.go:156: error parsing VCF query file /workdir/output/pcgc_fail_PASS.pcgr_ready.vep.vcf.gz: FILTER error: ##FILTER=<ID=common_variant,Description="">. [line: 4]

This error was not handled properly during processing: the program continued to run, but without any variants.

Quick solution: remove the FILTER line from the header section of your VCF, or try adding a Description(?)
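The second option of the quick solution could be sketched as follows: rewrite the header so that any `##FILTER` line with an empty `Description=""` gets placeholder text. This is only a sketch under assumptions; `fix_filter_line`, `patch_vcf_header`, and the placeholder text are hypothetical and not part of PCGR or vcfanno, and whether a non-empty Description is enough to satisfy vcfanno 0.3.0 is untested, as the question mark above suggests.

```python
import gzip
import re

# Matches a ##FILTER header line whose Description is an empty string.
EMPTY_DESC = re.compile(r'^(##FILTER=<.*Description=)""(.*>)\s*$')


def fix_filter_line(line, placeholder="No description provided"):
    """Return the line with an empty FILTER Description filled in."""
    match = EMPTY_DESC.match(line)
    if match is None:
        return line  # not a FILTER line with an empty Description
    return f'{match.group(1)}"{placeholder}"{match.group(2)}\n'


def patch_vcf_header(src_path, dst_path):
    """Copy a gzipped VCF, patching empty FILTER descriptions in the header."""
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in src:
            if line.startswith("##"):
                line = fix_filter_line(line)
            dst.write(line)
```

Note that Python's `gzip` module writes ordinary gzip rather than BGZF, so the patched file would need to be recompressed with bgzip before any tool that relies on a tabix index can use it.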

I need to work on a more robust solution to handle this, making sure that the vcf-validator and vcfanno have similar requirements for the query VCF.
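One possible shape for such a safeguard, independent of how any particular tool reports errors, is to compare record counts before and after each annotation step and stop when all variants disappear. The sketch below is an illustration only; `count_records` and `check_step_output` are hypothetical names and do not reflect how PCGR is actually implemented.

```python
import gzip


def count_records(vcf_path):
    """Count non-header, non-empty lines in a plain or gzipped VCF."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )


def check_step_output(input_vcf, output_vcf, step_name):
    """Raise if a step received variants but produced none."""
    n_in = count_records(input_vcf)
    n_out = count_records(output_vcf)
    if n_in > 0 and n_out == 0:
        raise RuntimeError(
            f"{step_name}: {n_in} input records but 0 output records; "
            "the step likely failed (check its error log)"
        )
    return n_in, n_out
```

Calling `check_step_output` after each step would turn a silent loss of all variants, as happened here, into an explicit failure at the step that caused it.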

thanks, Sigve

sigven commented 6 years ago

@brentp, do you know why vcfanno (I believe this error came with 0.3.0) rejects the encoding in the VCF above (i.e. a Description with no text in the header line for the FILTER column)? Can you point me to the VCF validation that vcfanno relies upon? Regards, Sigve

brentp commented 6 years ago

I see the cause in vcfgo. Will test a fix today.

sigven commented 6 years ago

thanks.

brentp commented 6 years ago

Just made a quick fix. @apastore, I can verify that it will now parse your VCF. If you care to replace your existing vcfanno binary with the one attached, you can proceed past this error. I'll make a new release of vcfanno this week.

vcfanno_linux64.gz

sigven commented 6 years ago

Great! Thanks a lot for the rapid response and fix. Will install the new version later this week.

brentp commented 6 years ago

New release: https://github.com/brentp/vcfanno/releases/tag/v0.3.1 Thanks for reporting.