sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

pcgr-summarise: AttributeError: 'dict' object has no attribute 'INFO' #205

Closed pdiakumis closed 1 year ago

pdiakumis commented 1 year ago

Running the latest PCGR (v1.3.0) with GRCh38, `pcgr-summarise` fails with the `AttributeError` shown in the log below. I think it stumbles on the recent change in pcgr-summarise (https://github.com/sigven/pcgr/commit/dd037cb404658bbd2675cb34daf21221666091e2), but it might just be the test VCF I'm using (though this hasn't happened with previous PCGR versions):

Version

```
$ pcgr --version
pcgr 1.3.0
```

Command

```
INPUT="../input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz"
SAMPLE="SEQC"
OUTDIR="../out/pcgr/${SAMPLE}"
pcgr \
  --debug \
  --output_dir "${OUTDIR}" \
  --assay "WGS" \
  --control_dp_tag "NORMAL_DP" \
  --control_af_tag "NORMAL_AF" \
  --tumor_dp_tag "TUMOR_DP" \
  --tumor_af_tag "TUMOR_AF" \
  --estimate_tmb \
  --force_overwrite \
  --genome_assembly "grch38" \
  --input_vcf "${INPUT}" \
  --pcgr_dir "../" \
  --report_theme "default" \
  --sample_id "${SAMPLE}" \
  --include_trials \
  --estimate_msi_status \
  --vep_buffer_size 5000 \
  --vep_pick_order "biotype,rank,appris,tsl,ccds,canonical,length,mane" \
  --pcgrr_conda umccrise_pcgrr
```

Log

```
2023-03-07 12:37:34 - pcgr-validate-input-arguments - INFO - PCGR - STEP 0: Validate input data and options
2023-03-07 12:37:34 - pcgr-validate-input-arguments - INFO - pcgr_validate_input.py /Users/pdiakumis/projects/sigverse /Users/pdiakumis/projects/sigverse/input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz None None None None 1 0 grch38 None TUMOR_DP TUMOR_AF NORMAL_DP NORMAL_AF _NA_ 0 0 --output_dir /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC --debug
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Skipping validation of VCF file (deprecated as of Dec 2021)
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Checking if existing INFO tags of query VCF file coincide with PCGR INFO tags
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - No query VCF INFO tags coincide with PCGR INFO tags
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for normal/control allelic fraction (control_af_tag NORMAL_AF) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for normal/control variant sequencing depth (control_dp_tag NORMAL_DP) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for tumor variant allelic fraction (tumor_af_tag TUMOR_AF) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for tumor variant sequencing depth (tumor_dp_tag TUMOR_DP) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Extracting variants on autosomal/sex/mito chromosomes only (1-22,X,Y, M/MT) with bcftools
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - bcftools view /Users/pdiakumis/projects/sigverse/input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz | bgzip -cf > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz && tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz && bcftools sort --temp-dir /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC -Oz /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz 2> /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/bcftools_1.pcgr_simplify_vcf.log && tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - bcftools view --regions chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY,chrM,chrMT,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,M,MT /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz | egrep -v '^##FORMAT=' | cut -f1-8 | sed 's/^chr//' > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp1.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - All sites seem to be decomposed - skipping decomposition!
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - cp /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp1.vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - bgzip -f /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf.gz
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - Finished pcgr-validate-input-arguments
----
2023-03-07 12:37:39 - pcgr-start - INFO - --- Personal Cancer Genome Reporter workflow ----
2023-03-07 12:37:39 - pcgr-start - INFO - Sample name: SEQC
2023-03-07 12:37:39 - pcgr-start - INFO - Tumor type: Any
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - type: WGS
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - mode: Tumor vs. Control
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - coding target size: 34Mb
2023-03-07 12:37:39 - pcgr-start - INFO - Genome assembly: grch38
2023-03-07 12:37:39 - pcgr-start - INFO - Mutational signature estimation: OFF
2023-03-07 12:37:39 - pcgr-start - INFO - MSI classification: ON
2023-03-07 12:37:39 - pcgr-start - INFO - Mutational burden estimation: ON
2023-03-07 12:37:39 - pcgr-start - INFO - Include molecularly targeted clinical trials (beta): ON
----
2023-03-07 12:37:39 - pcgr-vep - INFO - PCGR - STEP 1: Basic variant annotation with Variant Effect Predictor (105, GENCODE 39, grch38)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - one primary consequence block pr. alternative allele (--flag_pick_allele)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - transcript pick order: biotype,rank,appris,tsl,ccds,canonical,length,mane
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - transcript pick order: See more at https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick_options
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - GENCODE set: GENCODE - basic transcript set (--gencode_basic)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - skip intergenic: FALSE
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - regulatory annotation: OFF
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - buffer_size/number of forks: 5000/4
2023-03-07 12:37:39 - pcgr-vep - INFO - unset PERL5LIB && export PATH=/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin:"$PATH" && vep --input_file /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf.gz --output_file /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz --dir /Users/pdiakumis/projects/sigverse/data/grch38/.vep --assembly GRCh38 --cache_version 105 --fasta /Users/pdiakumis/projects/sigverse/data/grch38/.vep/homo_sapiens/105_GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --pick_order biotype,rank,appris,tsl,ccds,canonical,length,mane --buffer_size 5000 --fork 4 --hgvs --af --af_1kg --af_gnomad --variant_class --domains --symbol --protein --ccds --mane --uniprot --appris --biotype --tsl --canonical --format vcf --cache --numbers --total_length --allele_number --no_stats --no_escape --xref_refseq --vcf --check_ref --dont_skip --flag_pick_allele_gene --plugin NearestExonJB,max_range=50000 --force_overwrite --species homo_sapiens --offline --compress_output bgzip --verbose --gencode_basic
Possible precedence issue with control flow operator at /Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
2023-03-07 12:37:56 - pcgr-vep - INFO - tabix -f -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz
2023-03-07 12:37:56 - pcgr-vep - INFO - Finished pcgr-vep
----
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - PCGR - STEP 2: Annotation for precision oncology with pcgr-vcfanno
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - Annotation sources: ClinVar, dbNSFP, UniProtKB, cancerhotspots.org, CiVIC, CGI, DoCM, CHASMplus driver mutations, TCGA, ICGC-PCAWG
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - pcgr_vcfanno.py /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf /Users/pdiakumis/projects/sigverse/data/grch38 --num_processes 4 --chasmplus --dbnsfp --docm --clinvar --icgc --civic --cgi --tcga_pcdm --winmsk --simplerepeats --tcga --uniprot --cancer_hotspots --pcgr_onco_xref --debug --keep_logs
2023-03-07 12:37:57 - pcgr-vcfanno - INFO - vcfanno -p=4 /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.tmp.conf.toml /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.tmp.unsorted.1 2> /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.log
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - bgzip -f /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - tabix -f -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.gz
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - Finished pcgr-vcfanno
----
2023-03-07 12:37:59 - pcgr-summarise - INFO - PCGR - STEP 3: Cancer gene annotations with pcgr-summarise
2023-03-07 12:37:59 - pcgr-summarise - INFO - pcgr_summarise.py /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.gz 0 0 /Users/pdiakumis/projects/sigverse/data/grch38 --debug
Traceback (most recent call last):
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 165, in <module>
    __main__()
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 28, in __main__
    extend_vcf_annotations(args.vcf_file, args.pcgr_db_dir, logger, args.pon_annotation, args.regulatory_annotation, args.cpsr, args.debug)
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 138, in extend_vcf_annotations
    rec.INFO[k] = record[k]
AttributeError: 'dict' object has no attribute 'INFO'
```

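
For anyone hitting the same message: the traceback says that, at `rec.INFO[k] = record[k]`, the name `rec` refers to a plain `dict` rather than a VCF record object. Below is a minimal sketch of one way this class of error can arise. It is not the actual `pcgr_summarise.py` code; `Record`, `set_info_buggy`, and `set_info_fixed` are hypothetical names, and the rebinding of `rec` is only an assumption about the cause.

```python
class Record:
    """Hypothetical stand-in for a VCF record that exposes an INFO mapping."""

    def __init__(self):
        self.INFO = {}


def set_info_buggy(rec, picked):
    # Assumed bug: `rec` is rebound to the annotation dict, so the
    # attribute lookup below is performed on a dict, not on the record.
    rec = picked
    for k in picked:
        rec.INFO[k] = picked[k]


def set_info_fixed(rec, picked):
    # Keep `rec` pointing at the record and copy the annotations into INFO.
    for k, v in picked.items():
        rec.INFO[k] = v
    return rec


annotations = {"Consequence": "missense_variant"}

message = ""
try:
    set_info_buggy(Record(), annotations)
except AttributeError as err:
    message = str(err)

fixed = set_info_fixed(Record(), annotations)
print(message)
print(fixed.INFO)
```

The buggy variant reproduces the same `AttributeError` text as the log above, while the fixed variant leaves the annotations in the record's `INFO` mapping.
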
sigven commented 1 year ago

I am on it👍

sigven commented 1 year ago

@pdiakumis, could you test with the patch branch at https://github.com/sigven/pcgr/tree/pick_trans_consequence_patch?

pdiakumis commented 1 year ago

Works splendidly, thanks a lot ;-)