MathOnco / NeoPredPipe

Neoantigens prediction pipeline for multi- or single-region vcf files using ANNOVAR and netMHCpan.
GNU Lesser General Public License v3.0

list index out of range_postprocessing.py #32

Closed ghost closed 11 months ago

ghost commented 2 years ago

Hello developer! Thank you for the wonderful tool you put together.

I got an error and I don't know where it comes from. The test run worked fine, but with my real data I get the error below.


    INFO: Annovar reference files of build hg38 were given, using this build for all analysis.
    INFO: Begin.
    INFO: Running convert2annovar.py on ./BE3_sub1/BE3_sub1.vcf
    INFO: ANNOVAR VCF Conversion Process complete ./BE3_sub1/BE3_sub1.vcf
    INFO: Running annotate_variation.pl on ./Output/avready/BE3_sub1.avinput
    INFO: ANNOVAR annotation Process complete for ./Output/avready/BE3_sub1.avinput
    INFO: Running coding_change.pl on ./Output/avannotated/BE3_sub1.avannotated.exonic_variant_function
    INFO: Coding predictions complete for ./Output/avannotated/BE3_sub1.avannotated.exonic_variant_function
    INFO: Predicting neoantigens for BE3_sub1
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 9.Indels
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 9
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 8
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 8.Indels
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 10
    INFO: Running Epitope Predictions for BE3_sub1 on epitopes of length 10.Indels
    INFO: Predictions complete for BE3_sub1 on epitopes of length 10.Indels
    INFO: Digesting neoantigens for BE3_sub1
    INFO: Digesting neoantigens for BE3_sub1
    INFO: Digesting neoantigens for BE3_sub1
    INFO: Object size of neoantigens: 22316408 Kb
    Traceback (most recent call last):
      File "NeoPredPipe.py", line 524, in <module>
        main()
      File "NeoPredPipe.py", line 505, in main
        t.append(Sample(localpath, patname, patFile, hlas[patname], annPaths, netMHCpanPaths, pepmatchPaths, Options))
      File "NeoPredPipe.py", line 108, in __init__
        self.digestIndSample(FilePath, pepmatchPaths, Options)
      File "NeoPredPipe.py", line 178, in digestIndSample
        self.appendedEpitopes, self.regionsPresent = AppendDigestedEps(FilePath, self.digestedEpitopes, self.patID, self.annotationReady, self.avReadyFile, Options)
      File "/Users/giuseppe/Documents/Giuseppe Research/CRISPR Libraries/BE3_In_vivo/NGS_Data/Trimmed/NeoAntigen_prediction/NeoPredPipe/postprocessing.py", line 153, in AppendDigestedEps
        genotypeFormat, genotypeIndex = DefineGenotypeFormat(testLine)
      File "/Users/giuseppe/Documents/Giuseppe Research/CRISPR Libraries/BE3_In_vivo/NGS_Data/Trimmed/NeoAntigen_prediction/NeoPredPipe/postprocessing.py", line 94, in DefineGenotypeFormat
        formatInfo = testLine.split('\t')[19].split(':')
    IndexError: list index out of range
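
For readers hitting the same traceback: the last frame takes element 19 of a tab-split line, so the `IndexError` means the line that reached `DefineGenotypeFormat` had fewer than 20 tab-separated fields. Why the line is short is not established in this thread. The sketch below only reproduces the failure and shows a more descriptive check; `format_field` is a hypothetical helper, not part of NeoPredPipe, and both sample lines are made up.

```python
def format_field(line, index=19):
    """Return the ':'-split field at `index`, or raise a descriptive error.

    Hypothetical helper for illustration only; it is not NeoPredPipe code.
    """
    fields = line.rstrip('\n').split('\t')
    if len(fields) <= index:
        raise ValueError(
            'expected at least %d tab-separated fields, found %d'
            % (index + 1, len(fields))
        )
    return fields[index].split(':')


# Made-up lines: one with 20 fields, one with only 10.
long_line = '\t'.join(['x'] * 19 + ['GT:AD:AF'])
short_line = '\t'.join(['x'] * 10)

print(format_field(long_line))  # prints ['GT', 'AD', 'AF']

try:
    short_line.split('\t')[19]  # same expression as in the traceback
except IndexError as err:
    print('IndexError: %s' % err)  # prints IndexError: list index out of range
```

Counting fields first turns an opaque index error into a message that says how many columns were actually found.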

elakatos commented 2 years ago

Hm, the issue seems to be in reading in the Annovar-formatted vcf file to then process the multi-region information.

Can you let me know which software was used to generate the vcf file and an example of the file "BE3_sub1.avinput" (first few non-comment lines should be enough)?
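
One possible way to collect the requested lines is sketched below in Python. `head_noncomment` is a hypothetical helper name, and treating every line that starts with `#` as a comment is an assumption about the file. It also reports the number of tab-separated fields per line, which is the quantity the failing index depends on.

```python
def head_noncomment(path, n=3):
    """Return the first `n` non-comment, non-blank lines of `path`.

    Each entry is (number_of_tab_separated_fields, line_text).
    Lines starting with '#' are assumed to be comments.
    """
    rows = []
    with open(path) as handle:
        for line in handle:
            if line.startswith('#') or not line.strip():
                continue
            line = line.rstrip('\n')
            rows.append((len(line.split('\t')), line))
            if len(rows) == n:
                break
    return rows


# Example call, using the path printed in the log above:
# for count, text in head_noncomment('./Output/avready/BE3_sub1.avinput'):
#     print(count, text)
```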

moshl commented 2 years ago

Hi elakatos. I ran into the same problem when running the following command:

    python2 $NeoPredPipe/NeoPredPipe.py -I $VCFDIR/ -H $WORKDIR/CRC.hla.txt -o $WORKDIR/out -n CRC -m -c 1 2 -E 8 9 10 11 12

  1. Error

    Traceback (most recent call last):
      File "/opt/software/NeoPredPipe/NeoPredPipe.py", line 524, in <module>
        main()
      File "/opt/software/NeoPredPipe/NeoPredPipe.py", line 505, in main
        t.append(Sample(localpath, patname, patFile, hlas[patname], annPaths, netMHCpanPaths, pepmatchPaths, Options))
      File "/opt/software/NeoPredPipe/NeoPredPipe.py", line 108, in __init__
        self.digestIndSample(FilePath, pepmatchPaths, Options)
      File "/opt/software/NeoPredPipe/NeoPredPipe.py", line 178, in digestIndSample
        self.appendedEpitopes, self.regionsPresent = AppendDigestedEps(FilePath, self.digestedEpitopes, self.patID, self.annotationReady, self.avReadyFile, Options)
      File "/opt/software/NeoPredPipe/postprocessing.py", line 175, in AppendDigestedEps
        epID = int(ep.split('\t')[10].split('_')[0].replace('line',''))
    ValueError: invalid literal for int() with base 10: 'rm:'

  2. The VCF files were generated by Mutect2. The *.avinput file looks like this:

    fileformat=VCFv4.2
    FILTER=
    FILTER=
    FILTER=
    FILTER=
    normal_sample=B002N
    source=FilterMutectCalls
    source=Mutect2
    tumor_sample=B002P
    tumor_sample=B002T
    CHROM POS ID REF ALT QUAL FILTER INFO FORMAT B002N
    chr1 1054001 1054001 G A 0.3333 . 68 chr1 1054001 . G A . PASS AS_FilterStatus=SITE;AS_SB_TABLE=40,117|6,15;DP=188;ECNT=1;GERMQ=93;MBQ=37,26;MFRL=252,202;MMQ=60,60;MPOS=22;NALOD=1.48;NLOD=8.43;POPAF=3.91;TLOD=45.75 GT:AD:AF:DP:F1R2:F2R1:SB 0/0:35,0:0.032:35:19,0:16,0:8,27,0,0 0/1:54,21:0.265:75:31,5:23,15:12,42,6,15 0/1:68,0:0.016:68:30,0:38,0:20,48,0,0
    chr1 1629099 1629099 G A 0.3333 . 20 chr1 1629099 . G A . PASS AS_FilterStatus=SITE;AS_SB_TABLE=44,13|5,3;DP=69;ECNT=1;GERMQ=92;MBQ=29,28;MFRL=238,214;MMQ=60,60;MPOS=46;NALOD=1.09;NLOD=3.31;POPAF=6.00;TLOD=17.71 GT:AD:AF:DP:F1R2:F2R1:SB 0/0:12,0:0.075:12:7,0:5,0:10,2,0,0 0/1:25,8:0.241:33:11,4:14,4:17,8,5,3 0/1:20,0:0.048:20:10,0:10,0:17,3,0,0
    chr1 3531070 3531070 C T 0.3333 . 34 chr1 3531070 . C T . PASS AS_FilterStatus=SITE;AS_SB_TABLE=40,53|3,3;DP=99;ECNT=1;GERMQ=93;MBQ=29,38;MFRL=227,306;MMQ=60,60;MPOS=27;NALOD=1.15;NLOD=3.77;POPAF=6.00;TLOD=5.36 GT:AD:AF:DP:F1R2:F2R1:SB 0/0:14,1:0.067:15:9,0:4,0:3,11,0,1 0/1:47,3:0.052:50:16,0:31,2:21,26,2,1 0/1:32,2:0.032:34:20,0:12,0:16,16,1,1

  3. Environment settings:

    [annovar]
    convert2annovar = /opt/software/annovar/convert2annovar.pl
    annotatevariation = /opt/software/annovar/annotate_variation.pl
    coding_change = /opt/software/annovar/coding_change.pl
    gene_table = /opt/software/annovar/humandb/hg38_refGene.txt
    gene_fasta = /opt/software/annovar/humandb/hg38_refGeneMrna.fa
    humandb = /opt/software/annovar/humandb
    [netMHCpan]
    netMHCpan = /opt/src/netMHCpan-4.0/netMHCpan
    [PeptideMatch]
    peptidematch_jar = /opt/software/NeoPredPipe/PeptideMatchCMD_1.0.jar
    reference_index = /opt/software/NeoPredPipe/protein_db/uniprot_index/
    [blast]
    blastp = /opt/src/ncbi-blast-2.11.0+/bin/blastp
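
For reference, the `ValueError` in item 1 comes from an expression that expects column 10 of a tab-separated line to begin with an identifier of the form `line<N>_...`. The Python sketch below reproduces that parse and reports the offending value. `parse_line_id` is a hypothetical helper, not NeoPredPipe code; the two example rows are invented, and only the `'rm:'` token is taken from the error message above.

```python
def parse_line_id(ep, column=10):
    """Extract the integer N from an identifier like 'line12_...'.

    Mirrors the expression in the traceback, but reports the full
    offending column when the conversion fails.
    """
    fields = ep.split('\t')
    token = fields[column].split('_')[0].replace('line', '')
    try:
        return int(token)
    except ValueError:
        raise ValueError(
            'column %d holds %r, expected an identifier like "line12_..."'
            % (column, fields[column])
        )


# Invented rows: a well-formed identifier and the token from the error.
good = '\t'.join(['x'] * 10 + ['line12_NM_000001', 'y'])
bad = '\t'.join(['x'] * 10 + ['rm:', 'y'])

print(parse_line_id(good))  # prints 12

try:
    parse_line_id(bad)
except ValueError as err:
    print(err)
```

A value such as `'rm:'` in that column suggests the line being parsed is not an epitope record of the expected shape, though the thread does not confirm where it comes from.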

Looking forward to your reply!

Best wishes,

mo