Closed asmlgkj closed 2 years ago
I believe that this issue is encountered when the plugin files (in your case Wildtype.pm) are downloaded as HTML. Did you use the pvacseq install_vep_plugin command to download the plugin files, or did you download them from GitHub manually? Please ensure you are downloading the raw file (https://raw.githubusercontent.com/griffithlab/pVACtools/master/tools/pvacseq/VEP_plugins/Wildtype.pm). If this doesn't resolve your issue, please attach the Wildtype.pm you are using for further debugging.
Since VEP was unable to compile your Wildtype.pm file, it did not complete annotation with that plugin which is why you are seeing the pVACseq error in your second screenshot.
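A quick way to test the "downloaded as HTML" theory is to look at the first lines of the plugin file. The sketch below is only a heuristic, and `looks_like_html` is a hypothetical helper name, not part of pVACtools or VEP.

```shell
# Heuristic check: does a supposed Perl plugin file start like an HTML page?
# looks_like_html is a hypothetical helper name used for illustration only.
looks_like_html() {
    # Inspect only the first 20 lines; an HTML page announces itself early.
    head -n 20 "$1" | grep -qiE '<!DOCTYPE html|<html'
}

# Example use (file name taken from the thread):
#   if looks_like_html Wildtype.pm; then
#       echo "Wildtype.pm looks like HTML; download the raw file again" >&2
#   fi
```

If the check fires, fetching the raw URL quoted above (for example with `curl -L -o Wildtype.pm <raw URL>`) should give a file that starts with Perl source instead.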
Thanks, the Wildtype.pm is OK now, but Frameshift still does not work. My VCF is from VarDict.
Thanks a lot. After ignoring the warning, I ran this command:
docker run --rm --user $(id -u):$(id -g) -v /home/DATA/kobe:/data docker.io/griffithlab/pvactools pvacseq run /data/FL202109054CASE.filter123_vep.vcf_line_no_dot.filter.vcf_line_new FL202109054 HLA-A*02:01,HLA-B*35:01 MHCflurry NetMHCpan NetMHCIIpan /data -e1 8,9,10,11 --iedb-install-directory /opt/iedb -t 8 --tdna-vaf 0.05 --normal-sample-name FL202109054_N -k --downstream-sequence-length 1000
Then another strange thing came up.
Because the VEP-annotated VCF is too big, I extracted only the header and frameshift records with grep -e '#' -e 'frameshift_variant' > test.vcf and renamed the file to .zip just for uploading to GitHub. You can rename it back to .vcf: test.ZIP
This is the original VCF used for VEP annotation: FL202109054CASE.filter123.vcf.gz
This is the .pm file I used, renamed just for uploading. Thanks a lot. Frameshift.zip
Taking a quick glance at your VCF, it looks like the variants in your VCF aren't actually called in your tumor sample, i.e. they have a 0/0 genotype. pVACseq only processes variants that were actually observed in your sample of interest.
VarDict outputs this; it has 0/0, 0/1, and 1/1 genotypes.
zgrep -v '#' ~/Downloads/FL202109054CASE.filter123.vcf.gz | cut -f 10 | cut -f 1 -d : | sort | uniq -c
70590 0/0
As you can see, all of the variants in the VCF you provided are homozygous reference in the tumor sample, i.e. the variants are not observed in the tumor sample. They cannot be processed by pVACseq.
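The same tally can be made for any sample column, which helps when it is unclear which column holds the tumor sample. This is a minimal sketch: `count_gt` is a hypothetical helper, it reads an uncompressed VCF on stdin, and it assumes GT is the first FORMAT key, as it is in the records in this thread.

```shell
# Tally genotypes in one sample column of a VCF read from stdin.
# count_gt is a hypothetical helper; $1 is the 1-based column number
# (column 10 is the first sample column in a VCF).
count_gt() {
    awk -F '\t' -v col="$1" '
        /^#/ { next }                          # skip header lines
        { split($col, f, ":"); n[f[1]]++ }     # GT assumed to be the first key
        END { for (gt in n) print gt "\t" n[gt] }
    ' | sort
}

# Example use:
#   zcat FL202109054CASE.filter123.vcf.gz | count_gt 10   # first sample
#   zcat FL202109054CASE.filter123.vcf.gz | count_gt 11   # second sample
```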
So which genotypes does pVACtools use? Does it first filter out the variants that are 0/0 in the tumor sample?
In the code at https://github.com/genome/analysis-workflows/blob/master/definitions/pipelines/detect_variants.cwl I do not find a VarDict VCF, only:

    combine:
        run: ../tools/combine_variants.cwl
        in:
            reference: reference
            mutect_vcf: mutect/filtered_vcf
            strelka_vcf: strelka/filtered_vcf
            varscan_vcf: varscan/filtered_vcf
            pindel_vcf: pindel/filtered_vcf
        out:
            [combined_vcf]

Does this mean VarDict output has not been tested?
If I just replace all the 0/0 genotypes in the tumor sample (tumor-only mode or paired mode) with sed '/^chr/s#0/0#0/1#' a.vcf > b.vcf, will pVACtools use all the variants?
Hoping for an answer. Thanks a lot.
any comment about this @susannasiebert thanks a lot
I don't believe we've tried vardict. @jasonwalker80 @malachig can you confirm?
Editing the GT field the way you are doing should work but I think it will edit both the tumor and the sample genotypes. If you want to preserve the existing information you can also add a third, dummy sample to your VCF using the VAtools vcf-genotype-annotator (https://vatools.readthedocs.io/en/latest/vcf_genotype_annotator.html). You will need to redo any work you have done to add readcounts and expression information to your tumor sample and add them to the new dummy sample instead.
I don't know how you are planning on using the results from pVACseq but I would be very careful with any changes like this. You don't want to predict epitopes for variants that don't actually occur in a patient as vaccination with them will at best not work and at worst elicit a response against normal cells.
Thanks a lot. What do you mean by "edit both the tumor and the sample genotypes", and how would I edit them?
If I just replace all the 0/0 genotypes in the tumor sample (tumor-only mode or paired mode) with sed '/^chr/s#0/0#0/1#' a.vcf > b.vcf, will pVACtools use all the variants? I want to know how pVACseq selects variants by GT. Is there any concrete information about this filter?
I apologize. I meant to say that your sed command will replace 0/0 in both the tumor and the normal sample. If you're ok with that then your sed command will work. pVACseq accepts any variant that's not homozygous reference (0/0). 0/1, 1/0, and 1/1 genotypes will all be processed.
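One detail worth verifying on your own file: without the g flag, sed's s command replaces only the first match on each line, so which sample ends up edited depends on where the first 0/0 happens to sit. A sketch that restricts the edit to one explicit column with awk is below. `force_het_in_column` is a hypothetical helper, the column number is an assumption about your file, and the caution earlier in this thread about calling variants that are not really present still applies.

```shell
# Rewrite a 0/0 genotype to 0/1 in exactly one sample column of a VCF
# read from stdin. force_het_in_column is a hypothetical helper name;
# $1 is the 1-based column of the sample to edit (assumed to be known).
force_het_in_column() {
    awk -v col="$1" 'BEGIN { FS = OFS = "\t" }
        /^#/ { print; next }                   # pass header lines through
        {
            # Only touch the GT sub-field at the start of the chosen column.
            if ($col ~ /^0\/0(:|$)/)
                $col = "0/1" substr($col, 4)
            print
        }'
}

# Example use (editing the first sample column only):
#   force_het_in_column 10 < a.vcf > b.vcf
```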
@susannasiebert Thanks a lot. "pVACseq accepts any variant that's not homozygous reference (0/0)": does this refer to the tumor GT, the normal GT, or must both be 0/1? By the way, do you know what 1/0 means? I cannot find any reference for 1/0.
Only the tumor sample needs to be called. The genotype of the normal sample is not taken into account.
I don't know what 1/0 means. It might just be a chromosome-aware representation of 0/1, but that would be weird without phasing, so I really don't know for sure.
Normally the order of the alleles (1|0 vs 0|1) is used to define phasing. A haplotype is built by combining all alleles in the first column and all alleles in the second column. Normally phased alleles are indicated by separation with a "|" instead of a "/". An example of "1|0" is shown in the first page of the VCF spec:
https://samtools.github.io/hts-specs/VCFv4.2.pdf
Here you have "1/0" though. Maybe it still relates to phasing but from a tool process that does not use the "|"? Just speculation. How was your VCF created, out of curiosity?
On Thu, Sep 30, 2021 at 9:09 AM Susanna Kiwala @.***> wrote:
Only the tumor sample needs to be called. The genotype of the normal sample is not taken into account.
I don't know what 1/0 means. It might just be a read-aware representation of 0/1 but I really don't know for sure.
View it on GitHub: https://github.com/griffithlab/pVACtools/issues/705#issuecomment-931357688
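The separator convention from the email above ("|" for phased, "/" for unphased) and the "not homozygous reference" rule stated earlier can be sketched as two small checks. Both function names are hypothetical, and this is an illustration of the convention, not pVACseq's actual code.

```shell
# Classify a GT string by its separator, following the VCF convention
# described above. gt_kind is a hypothetical helper name.
gt_kind() {
    case "$1" in
        *"|"*) echo phased ;;
        */*)   echo unphased ;;
        *)     echo other ;;
    esac
}

# Succeed when at least one allele index is a non-zero number, i.e. the
# genotype is not homozygous reference and not missing.
# gt_has_alt is a hypothetical helper name.
gt_has_alt() {
    printf '%s\n' "$1" | tr '/|' '\n\n' | grep -qE '^[1-9][0-9]*$'
}

# Example use:
#   gt_kind '1|0'      # prints: phased
#   gt_kind '1/0'      # prints: unphased
#   gt_has_alt '1/0' && echo "has an alternate allele"
```

Under this reading, 1/0 and 0/1 are treated alike: both are unphased and both carry one alternate allele.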
So sed '/^chr/s#0/0#0/1#' still works for the variants that I want to pass to pVACseq, because this command will not change 1/0, 0/1, or 1/1 in the tumor GT but only replaces 0/0. Am I right? Thanks a lot.
Like I said previously, yes, it should work.
Thanks a lot. It is a VCF generated by the tool VarDict; I do not know its internal phasing method. I am not sure what you mean by "How was your VCF created, out of curiosity?". English is not my first language, so I am afraid of misunderstanding your question.
Thanks a lot.
@malachig I believe this VCF was created using VarDict as the variant caller.
Some of the developers talking about this exact question: https://github.com/AstraZeneca-NGS/VarDict/issues/34
Its GT is very strange: records tagged as StrongSomatic also have GT 0/0 in the tumor sample. Here are some records:
chr1 2488079 . C T 115 f0.02;p8 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=559;VD=6;AF=0.0107;SHIFT3=0;MSI=1.000;MSILEN=1;SSF=0.09318;SOR=Inf;LSEQ=CATCCTGCTAGCTGGGTTCC;RSEQ=GAGCTGCCGGTCTGAGCCTG GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:559:6:6,0:519,34:553,6:0.0107:2,0:6:1:44.8:1:1:0:60:12:0.0107:0.0089:1.3 0/0:270:0:0,0:156,114:270,0:0:2,0:43.7:1:36.3:1:1:0:60:89:1:0:0.5
chr1 2488170 . C A 104 f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=643;VD=5;AF=0.0078;SHIFT3=0;MSI=1.000;MSILEN=1;SSF=0.20046;SOR=Inf;LSEQ=CCAAAACCGACGTCTTGAGG;RSEQ=TGGTGAGCCCCCGAGCCTCC GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:643:5:0,5:128,510:638,5:0.0078:2,0:9.4:1:45:1:0.5888:0:60:10:0.0078:0.0062:2.6 0/0:243:0:0,0:89,154:243,0:0:2,0:39.6:1:36.6:1:1:0:60:120.5:1:0:0.7
chr1 2489179 . C A 45 v3;f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=427;VD=2;AF=0.0047;SHIFT3=0;MSI=2.000;MSILEN=1;SSF=0.41822;SOR=Inf;LSEQ=CCTTAGGTGCTGTATCTCAC;RSEQ=TTCCTGGGAGCCCCCTGCTA GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:427:2:2,0:240,185:425,2:0.0047:2,0:21.5:1:45:1:0.50776:0:60:4:0.0047:0.0023:1 0/0:233:0:0,0:129,104:233,0:0:2,0:40.7:1:36.2:1:1:0:60:45.6:1:0:0.2
chr1 2489200 . C T 45 v3;f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=449;VD=2;AF=0.0045;SHIFT3=0;MSI=1.000;MSILEN=1;SSF=0.4111;SOR=Inf;LSEQ=TTCCTGGGAGCCCCCTGCTA;RSEQ=GCCCCAGCTCTGCCGTCCTG GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:449:2:1,1:213,234:447,2:0.0045:2,2:39:1:45:0:1:1.10:60:4:0.0045:0:1 0/0:251:0:0,0:120,131:251,0:0:2,0:39.7:1:36.5:1:1:0:60:250:1:0:0.2
chr1 2489235 . AC A 45 v3;f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=Deletion;DP=410;VD=2;AF=0.0049;SHIFT3=2;MSI=3.000;MSILEN=1;SSF=0.3879;SOR=Inf;LSEQ=GTCCTGCAAGGAGGACGAGT;RSEQ=CCAGTGGGCTCCGAGTGCTG GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:410:2:0,2:146,262:408,2:0.0049:2,0:17:1:45:1:0.54029:0:60:4:0.0049:0.0024:1 0/0:248:0:0,0:104,144:248,0:0:2,0:40.5:1:36.4:1:1:0:60:81.667:1:0:0.2
chr1 2489837 . C T 70 f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=287;VD=3;AF=0.0105;SHIFT3=0;MSI=3.000;MSILEN=1;SSF=0.19171;SOR=Inf;LSEQ=GGCACAGTGTGTGAACCCTG;RSEQ=CCTCCAGGCACCTACATTGC GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:287:3:3,0:126,158:284,3:0.0105:2,0:13.7:1:44.7:1:0.08964:0:60:6:0.0105:0.007:2.3 0/0:210:0:0,0:110,100:210,0:0:2,0:40.9:1:36:1:1:0:60:41:1:0:0.3
chr1 2489914 . G A 45 v3;f0.02 STATUS=StrongSomatic;SAMPLE=HH0321091801;TYPE=SNV;DP=265;VD=2;AF=0.0075;SHIFT3=0;MSI=2.000;MSILEN=1;SSF=0.33134;SOR=Inf;LSEQ=AATGTGTGACCCAGGTAAGA;RSEQ=GCCAGCACAGCCGGCCCAGC GT:DP:VD:ALD:RD:AD:AF:BIAS:PMEAN:PSTD:QUAL:QSTD:SBF:ODDRATIO:MQ:SN:HIAF:ADJAF:NM 0/0:265:2:1,1:107,156:263,2:0.0075:2,2:25:1:45:1:1:1.46:60:4:0.0075:0.0038:1.5 0/0:195:0:0,0:85,110:195,0:0:2,0:36.3:1:36.3:1:1:0:60:64:1:0:0.2
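In the records above, every tumor AF value is below 0.02 and every FILTER column contains f0.02, so it may help to print GT and AF side by side before deciding how to treat these calls. The sketch below looks a FORMAT key up by name instead of by position. `sample_field` is a hypothetical helper, it expects a tab-separated VCF on stdin, and treating column 10 as the tumor sample is an assumption based on the non-zero VD values in that column.

```shell
# Print CHROM, POS and one named FORMAT value for one sample column.
# sample_field is a hypothetical helper; $1 is the FORMAT key (e.g. AF),
# $2 is the 1-based sample column (10 assumed to be the tumor here).
sample_field() {
    awk -v key="$1" -v col="$2" 'BEGIN { FS = OFS = "\t" }
        /^#/ { next }                          # skip header lines
        {
            n = split($9, keys, ":")           # FORMAT keys for this record
            split($col, vals, ":")             # matching sample values
            for (i = 1; i <= n; i++)
                if (keys[i] == key) { print $1, $2, vals[i]; break }
        }'
}

# Example use:
#   zcat FL202109054CASE.filter123.vcf.gz | sample_field AF 10
#   zcat FL202109054CASE.filter123.vcf.gz | sample_field GT 10
```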
@asmlgkj Which version of VarDict are you using?
Thanks a lot. The latest, https://github.com/AstraZeneca-NGS/VarDictJava/releases/tag/v1.8.2. I also cloned the GitHub repository and compiled it myself; the result is the same.
thanks a lot
Describe the bug
Failed to compile plugin Wildtype: Excessively long <> operator at /data/database/anno/vep104/VEP_plugins-release-104/Wildtype.pm line 20.
Use of uninitialized value in addition (+) at Frameshift.pm line 129, <$fh> line 281368
To Reproduce
docker run --rm --user $(id -u):$(id -g) -v /home/DATA/kobe:/data docker.io/ensemblorg/ensembl-vep vep --input_file /data/FL202109054CASE.filter123.vcf.gz --output_file /data/FL202109054CASE.filter123.vcf.gz_vep --dir_cache /data/database/anno/vep104 --dir_plugins /data/database/anno/vep104/VEP_plugins-release-104 --fasta /data/database/anno/vep104/Homo_sapiens.GRCh37.dna.primary_assembly_chr.fa --offline --cache --force_overwrite --transcript_version --refseq --assembly GRCh37 --format vcf --cache_version 104 --keep_csq --variant_class --vcf --sift b --polyphen b --ccds --hgvs --symbol --numbers --canonical --gene_phenotype --af_1kg --af_esp --af_gnomad --pubmed --var_synonyms --variant_class --fork 4 --check_existing --phased --numbers --xref_refseq --plugin Frameshift --plugin Wildtype --tsl --terms SO
Log Output
Output File
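After a VEP run like the one above, one way to confirm that both plugins actually annotated the file is to look for their fields in the CSQ header line. The sketch below assumes the plugins add fields named WildtypeProtein and FrameshiftSequence, which is my understanding of the pVACtools plugins rather than something stated in this thread. `csq_has_field` is a hypothetical helper that reads the VCF on stdin.

```shell
# Succeed when the VEP CSQ header of a VCF (read from stdin) lists a field.
# csq_has_field is a hypothetical helper; $1 is the exact field name.
csq_has_field() {
    grep '^##INFO=<ID=CSQ' \
        | sed 's/.*Format: //; s/">.*//' \
        | tr '|' '\n' \
        | grep -qx "$1"
}

# Example use (field names are assumptions, see the note above):
#   csq_has_field WildtypeProtein < annotated.vcf || echo "Wildtype plugin field missing"
#   csq_has_field FrameshiftSequence < annotated.vcf || echo "Frameshift plugin field missing"
```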