ambj / MuPeXI

MuPeXI: the mutant peptide extractor and informer, a tool for predicting neo-epitopes from tumor sequencing data.

Issues when running on the web server #5

Closed kevin199011 closed 6 years ago

kevin199011 commented 7 years ago

Hi, I have a somatic-mutation-filtered VCF file generated by GATK HaplotypeCaller. I used that file to run MuPeXI on the web server, but got the following error. Could you please let me know what's wrong? Thank you!

Traceback (most recent call last):
  File "/usr/cbs/bio/src/MuPeXI-1.1/MuPeXI/MuPeXI.py", line 1577, in <module>
    main(sys.argv[1:])
  File "/usr/cbs/bio/src/MuPeXI-1.1/MuPeXI/MuPeXI.py", line 78, in main
    peptide_info, peptide_counters, fasta_printout, pepmatch_file_names = peptide_extraction(peptide_length, vep_info, proteome_reference, genome_reference, reference_peptides, reference_peptide_filenames, input.fasta_file_name, paths.peptide_match, tmpdir, input.webserver, input_.printmismatch, input.keeptemp, input.prefix, input.outdir, input.num_mismatches)
  File "/usr/cbs/bio/src/MuPeXI-1.1/MuPeXI/MuPeXI.py", line 667, in peptide_extraction
    peptide_info, pepmatch_file_names = normal_peptide_correction(mutated_peptides_missing_normal, mutation_info, p_length, reference_peptide_file_names, peptide_info, peptide_match, tmp_dir, pepmatch_file_names, webserver, print_mismatch, num_mismatches)
  File "/usr/cbs/bio/src/MuPeXI-1.1/MuPeXI/MuPeXI.py", line 772, in normal_peptide_correction
    assert mutated_peptide in peptide_info
AssertionError

Kevin

ambj commented 7 years ago

Hi Kevin

This is most likely due to an incompatibility between the VCF file, the VEP version, the reference, and/or the genomic reference used when calling the variants. What genome version was used when generating the VCF file? The webserver only runs with GRCh38, but you can choose to do a liftover from GRCh37/HG19 to GRCh38 (a box you tick on the webserver).

As a note, I would recommend using MuTect2 as the variant caller, since it is the GATK combination of MuTect and HaplotypeCaller.

Best, Anne-Mette
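
One quick way to gather the information asked for above is to read it from the VCF itself. The sketch below is a minimal illustration and not part of MuPeXI: `summarize_vcf` is a hypothetical helper that reports the `##reference=` header line (if the variant caller wrote one) and whether the contig names are `chr`-prefixed. It assumes a standard tab-separated VCF, and the reference path in the demo is made up. It cannot prove which genome build was used; it only surfaces the hints the file contains.

```python
import io


def summarize_vcf(handle):
    """Collect the ##reference header value and the contig naming style."""
    reference = None
    contigs = set()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##reference="):
            reference = line[len("##reference="):]
        elif line and not line.startswith("#"):
            # First tab-separated column of a record is CHROM.
            contigs.add(line.split("\t", 1)[0])

    if not contigs:
        style = "none"
    elif all(c.startswith("chr") for c in contigs):
        style = "chr-prefixed"
    elif not any(c.startswith("chr") for c in contigs):
        style = "unprefixed"
    else:
        style = "mixed"
    return {"reference": reference, "contigs": sorted(contigs), "contig_style": style}


# Demo on a tiny in-memory VCF (illustrative header values only).
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "##reference=file:///path/to/GRCh38.fa\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr7\t986527\t.\tC\tA\t.\tPASS\t.\n"
    "chr7\t1550183\t.\tG\tA\t.\tPASS\t.\n"
)
print(summarize_vcf(demo))
```

For a file on disk, pass an open text handle, for example `with open("my.vcf") as fh: summarize_vcf(fh)`.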

kevin199011 commented 7 years ago

Hi Anne-Mette, the genome reference I used is GRCh38, release-90. I have pasted part of the header and a few of the records here for format reference:

fileformat=VCFv4.2

FILTER=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TUMOR

1 946247 . G A 102.28 PASS AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=34.09;SOR=1.179 GT:AD:DP:GQ:PL 1/1:0,3:3:9:130,9,0

1 953259 . T C 90.28 PASS AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=30.09;SOR=2.833 GT:AD:DP:GQ:PL 1/1:0,3:3:9:118,9,0

1 953279 . T C 98.28 PASS AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=32.76;SOR=2.833 GT:AD:DP:GQ:PL 1/1:0,3:3:9:126,9,0

1 1014274 . A G 219.80 PASS AC=2;AF=1.00;AN=2;DP=7;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=31.40;SOR=0.941 GT:AD:DP:GQ:PL 1/1:0,7:7:21:248,21,0

Thank you

Kevin

ambj commented 7 years ago

Hi Kevin

I just ran the part of the VCF file you sent on the web server, and no error occurred. With the information you have given me so far, and since I'm not able to reproduce the error, I'm unfortunately not able to solve the problem you're experiencing.

Best, Anne-Mette

elakatos commented 6 years ago

Dear Anne-Mette, I ran into the same issue, and found that the error may not show up when only a chunk of the data is used. In my case I managed to find one single line that led to the issue (and the issue happens both on the server and on my local machine). Below is the VCF file that reproduces the error. When the last line is deleted, there are no errors anymore. Can you help me understand what the issue with that one particular line could be? (It's the only problematic one out of 50000.)

Best, Eszter


fileformat=VCFv4.2

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

SAMPLE=

reference=file:///var/lib/cwl/job578910639_reference/GRCh38.d1.vd1.fa

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR

chr7 986527 . C A . PASS ECNT=1;HCNT=22;MAX_ED=.;MIN_ED=.;NLOD=27.62;TLOD=10.35 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:93,0:0.00:0:0:.:2406,0:49:44 0/1:57,6:0.081:5:1:0.167:1457,187:32:25

chr7 1550183 rs566323218 G A . PASS DB;ECNT=1;HCNT=35;MAX_ED=.;MIN_ED=.;NLOD=6.71;TLOD=28.11 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:27,0:0.00:0:0:.:533,0:14:13 0/1:17,10:0.385:6:4:0.600:367,308:6:11

chr7 5987342 . C T . PASS ECNT=1;HCNT=30;MAX_ED=.;MIN_ED=.;NLOD=34.15;TLOD=106.75 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:141,0:0.00:0:0:.:3797,0:69:72 0/1:77,47:0.386:25:22:0.468:2093,1428:37:40

chr7 6049896 . AT A . PASS ECNT=1;HCNT=18;MAX_ED=.;MIN_ED=.;NLOD=20.71;RPA=6,5;RU=T;STR;TLOD=67.89 GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:83,0:0.00:0:0:2306,0:57:26 0/1:74,49:0.408:21:28:2105,1426:32:42
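
Isolating one offending record out of tens of thousands, as described above, can be done by bisection rather than by trial and error. The following is a minimal sketch under stated assumptions, not the procedure actually used in this thread: it assumes exactly one record triggers the failure and that any subset containing it also fails. `find_failing_record` and `fails` are hypothetical names; in practice `fails` would write the header plus the subset to a temporary file, run the tool on it, and report whether it crashed. The demo predicate is a stand-in that simply flags the position reported as problematic, and says nothing about why that record fails.

```python
def find_failing_record(records, fails):
    """Return the single record that makes `fails` return True, or None.

    Assumes exactly one culprit and that a subset fails if and only if
    it contains that culprit.
    """
    if not records or not fails(records):
        return None
    lo, hi = 0, len(records)  # the culprit's index lies in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(records[lo:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return records[lo]


# Demo with the first five columns of the four records shown above.
records = [
    "chr7\t986527\t.\tC\tA",
    "chr7\t1550183\trs566323218\tG\tA",
    "chr7\t5987342\t.\tC\tT",
    "chr7\t6049896\t.\tAT\tA",
]


def fails(subset):
    # Stand-in for "run the tool and see whether it crashes" (assumption).
    return any(r.split("\t")[1] == "6049896" for r in subset)


print(find_failing_record(records, fails))
```

Each step halves the candidate range, so a file of 50000 records needs roughly 16 runs of the tool instead of one run per record.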

ambj commented 6 years ago

Dear Eszter

Thank you very much for making me aware of the error and for sending the problematic part of the VCF file; this will make it much easier for me to troubleshoot the problem. Could you also send me the exact error messages you get?

Best, Anne-Mette

ambj commented 6 years ago

Dear Eszter

I have not been able to reproduce the error with the newest version of MuPeXI on the development (dev) branch. This will be merged into master in the new release by the end of this week. If you still see this error in your data, do not hesitate to contact me.

Best, /AM