MathOnco / NeoPredPipe

Neoantigens prediction pipeline for multi- or single-region vcf files using ANNOVAR and netMHCpan.
GNU Lesser General Public License v3.0

AttributeError: Sample instance has no attribute 'appendedEpitopesIndels' #18

Closed ravichas closed 3 years ago

ravichas commented 4 years ago

Hello NeoPredPipe developers:

I am using the latest version of NeoPredPipe. I cloned the repo two days ago.

I am using Python 2.7.15 and I am on a Linux box
(Linux version 3.10.0-862.14.4.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC) ) #1 SMP Wed Sep 26 15:12:11 UTC 2018)

When I run the following command, I get an AttributeError.

sravi@node1 NeoPredPipe]$ python NeoPredPipe.py -I ./Example/input_vcfs -H ./Example/HLAtypes/hlatypes.txt -o ./ -n TestRun -c 1 2 -E 8 9 10
INFO: Annovar reference files of build hg19 were given, using this build for all analysis.
INFO: Begin.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: Running convert2annovar.py on ./Example/input_vcfs/test2.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test2.vcf
INFO: Running annotate_variation.pl on ./avready/test2.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test2.avinput
INFO: Coding change fasta files for test2 already present.
INFO: Coding change fasta files test2 has already been reformatted.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 8.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 9.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 10.
INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
INFO: Running convert2annovar.py on ./Example/input_vcfs/test1.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test1.vcf
INFO: Running annotate_variation.pl on ./avready/test1.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test1.avinput
INFO: Coding change fasta files for test1 already present.
INFO: Coding change fasta files test1 has already been reformatted.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 8.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 9.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 10.
INFO: Predicting neoantigens for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Predictions complete for test1 on epitopes of length 10.Indels
INFO: Summary Tables Complete.
Traceback (most recent call last):
  File "NeoPredPipe.py", line 524, in <module>
    main()
  File "NeoPredPipe.py", line 516, in main
    FinalOut(t, Options, True)
  File "NeoPredPipe.py", line 278, in FinalOut
    appendedEps = getattr(sampleClasses[i],epitopesToProcess)
AttributeError: Sample instance has no attribute 'appendedEpitopesIndels'

Any help would be greatly appreciated. Thanks ravi

elakatos commented 4 years ago

Hi Ravi,

This issue can come up when the prediction table is completely empty, as it seems to be the case based on all the "Skipping sample" lines. However, it shouldn't be the case for the vcfs in the Example folder... This line "INFO: Tmp fasta files test2 has already been created for netMHCpan length 8." seems to suggest to me that there was an aborted run before this particular one, and maybe leftover files from that, which are erroneous, are interfering now. Can you delete all intermediate folders (avready, avannotated, fastaFiles, tmp), or all test1/test2 files in them, and run the same command again? Can you let me know if it fixes the issue?
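The cleanup suggested above can be sketched in Python. This is only an illustration: the folder names come from the comment, while `clean_intermediates` and its `base_dir` argument are hypothetical and assume the folders sit in the directory the pipeline was run from.

```python
import os
import shutil

# Folder names taken from the comment above; base_dir is an assumption
# (the directory NeoPredPipe was run from, or the one passed via -o).
INTERMEDIATE_DIRS = ("avready", "avannotated", "fastaFiles", "tmp")


def clean_intermediates(base_dir=".", names=INTERMEDIATE_DIRS):
    """Delete the intermediate folders under base_dir; return the names removed."""
    removed = []
    for name in names:
        path = os.path.join(base_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)  # removes the folder and everything inside it
            removed.append(name)
    return removed
```

Calling `clean_intermediates(".")` before re-running the command forces every step to be recomputed instead of reusing files left over from an aborted run.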

Eszter

ravichas commented 4 years ago

Hello Eszter: Thanks for your help.

I repeated the run by deleting all the intermediate folders. I still see the same error.

Here is my *ini file (I am not using PeptideMatch):

[ravichandrans@cn3131 NeoPredPipe]$ cat usr_paths.ini

[annovar]
convert2annovar = $ANNOVAR_HOME/convert2annovar.pl
annotatevariation = $ANNOVAR_HOME/annotate_variation.pl
coding_change = $ANNOVAR_HOME/coding_change.pl
gene_table = $ANNOVAR_HOME/humandb/hg19_refGene.txt
gene_fasta = $ANNOVAR_HOME/humandb/hg19_refGeneMrna.fa
humandb = $ANNOVAR_HOME/humandb
[netMHCpan]
netMHCpan = /data/ravichandrans/netMHCpan-4.0/Linux_x86_64/bin/netMHCpan
[PeptideMatch]
peptidematch_jar = /your/path/to/PeptideMatchCMD_1.0.jar
reference_index = /your/path/to/protein_database/index/
[blast]
blastp = /usr/local/apps/blast/ncbi-blast-2.10.0+/bin/blastp

Here are the commands and the output:

Currently Loaded Modules: 1) python/2.7 2) blast/2.10.0+ 3) annovar/2018-04-16

[ravichandrans@cn3131 NeoPredPipe]$ python NeoPredPipe.py -I ./Example/input_vcfs -H ./Example/HLAtypes/hlatypes.txt -o ./ -n TestRun -c 1 2 -E 8 9 10
INFO: Annovar reference files of build hg19 were given, using this build for all analysis.
INFO: Begin.
INFO: Running convert2annovar.py on ./Example/input_vcfs/test2.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test2.vcf
INFO: Running annotate_variation.pl on ./avready/test2.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test2.avinput
INFO: Running coding_change.pl on ./avannotated/test2.avannotated.exonic_variant_function
INFO: Coding predictions complete for ./avannotated/test2.avannotated.exonic_variant_function
INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
INFO: Running convert2annovar.py on ./Example/input_vcfs/test1.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test1.vcf
INFO: Running annotate_variation.pl on ./avready/test1.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test1.avinput
INFO: Running coding_change.pl on ./avannotated/test1.avannotated.exonic_variant_function
INFO: Coding predictions complete for ./avannotated/test1.avannotated.exonic_variant_function
INFO: Predicting neoantigens for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Predictions complete for test1 on epitopes of length 10.Indels
INFO: Summary Tables Complete.
Traceback (most recent call last):
  File "NeoPredPipe.py", line 524, in <module>
    main()
  File "NeoPredPipe.py", line 516, in main
    FinalOut(t, Options, True)
  File "NeoPredPipe.py", line 278, in FinalOut
    appendedEps = getattr(sampleClasses[i],epitopesToProcess)
AttributeError: Sample instance has no attribute 'appendedEpitopesIndels'

I am not sure whether I have followed all the instructions. Can you tell me how Biopython is used in the pipeline?

Thanks again for your help.

Ravi

elakatos commented 4 years ago

Hi! It looks like the ANNOVAR processing of the files is the problem. I suspect this part of the ini file, "$ANNOVAR_HOME/convert2annovar.pl": the file is processed by Python, and I'm afraid the environment variable $ANNOVAR_HOME is not expanded properly. I expect the .avinput, .avannotated and .fasta files to be empty because of that. Can you try changing the ini file to provide the full path to ANNOVAR (without the variable) and let me know if it fixes the issue?
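To illustrate the concern about `$ANNOVAR_HOME`: Python's ini parsers return values verbatim, so a `$VAR` reference stays a literal string unless the code expands it explicitly. The sketch below shows one way to expand and check such values. How NeoPredPipe itself reads usr_paths.ini is not shown in this thread, so `read_expanded_paths` and `unresolved` are hypothetical helpers, not part of the pipeline.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def read_expanded_paths(ini_path):
    """Return {section: {option: value}} with $VARS expanded in every value."""
    parser = configparser.RawConfigParser()
    parser.read(ini_path)
    expanded = {}
    for section in parser.sections():
        expanded[section] = {}
        for option, value in parser.items(section):
            # expandvars leaves unknown variables untouched, e.g. "$UNSET/x"
            expanded[section][option] = os.path.expandvars(value)
    return expanded


def unresolved(expanded):
    """List (section, option) pairs whose value still contains a '$'."""
    return [(s, o) for s in expanded for o, v in expanded[s].items() if "$" in v]
```

Any pair reported by `unresolved(read_expanded_paths("usr_paths.ini"))` points at a path that still needs to be written out in full.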

Biopython is used in processing the temporary fasta files produced in the coding_change step.

Eszter

ravichas commented 4 years ago

Thanks again for your help. Do I need to install Biopython? Do you have installation instructions for Biopython? I ran it with explicit paths in the *ini file, but I don't have Biopython. I am getting a similar error.

Thanks

here is the ini

-------------------------

[annovar]
convert2annovar = /usr/local/apps/annovar/2018-04-16/convert2annovar.pl
annotatevariation = /usr/local/apps/annovar/2018-04-16/annotate_variation.pl
coding_change = /usr/local/apps/annovar/2018-04-16/coding_change.pl
gene_table = /usr/local/apps/annovar/2018-04-16/humandb/hg19_refGene.txt
gene_fasta = /usr/local/apps/annovar/2018-04-16/humandb/hg19_refGeneMrna.fa
humandb = /uar/local/apps/annovar/2018-04-16/humandb
[netMHCpan]
netMHCpan = /data/ravi/netMHCpan-4.0/Linux_x86_64/bin/netMHCpan
[PeptideMatch]
peptidematch_jar = /your/path/to/PeptideMatchCMD_1.0.jar
reference_index = /your/path/to/protein_database/index/
[blast]
blastp = /usr/local/apps/blast/ncbi-blast-2.10.0+/bin/blastp

[ravi@cn3607 NeoPredPipe]$ module load annovar/2018-04-16
[+] Loading annovar 2018-04-16 on cn3607
[ravi@cn3607 NeoPredPipe]$
[ravi@cn3607 NeoPredPipe]$ rm -fr tmp avready avannotated fastaFiles
[ravi@cn3607 NeoPredPipe]$ python NeoPredPipe.py -I ./Example/input_vcfs -H ./Example/HLAtypes/hlatypes.txt -o ./ -n TestRun -c 1 2 -E 8 9 10
INFO: Annovar reference files of build hg19 were given, using this build for all analysis.
INFO: Begin.
INFO: Running convert2annovar.py on ./Example/input_vcfs/test2.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test2.vcf
INFO: Running annotate_variation.pl on ./avready/test2.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test2.avinput
INFO: Running coding_change.pl on ./avannotated/test2.avannotated.exonic_variant_function
INFO: Coding predictions complete for ./avannotated/test2.avannotated.exonic_variant_function
INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
INFO: Running convert2annovar.py on ./Example/input_vcfs/test1.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test1.vcf
INFO: Running annotate_variation.pl on ./avready/test1.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test1.avinput
INFO: Running coding_change.pl on ./avannotated/test1.avannotated.exonic_variant_function
INFO: Coding predictions complete for ./avannotated/test1.avannotated.exonic_variant_function
INFO: Predicting neoantigens for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Predictions complete for test1 on epitopes of length 10.Indels
INFO: Summary Tables Complete.
Traceback (most recent call last):
  File "NeoPredPipe.py", line 524, in <module>
    main()
  File "NeoPredPipe.py", line 516, in main
    FinalOut(t, Options, True)
  File "NeoPredPipe.py", line 278, in FinalOut
    appendedEps = getattr(sampleClasses[i],epitopesToProcess)
AttributeError: Sample instance has no attribute 'appendedEpitopesIndels'

elakatos commented 4 years ago

Hi! Actually, I double checked and test1.vcf has indeed no exonic mutations, and because of that it throws an error. I will look into this and update the test files to make sure they are correct!
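The traceback fails on a two-argument `getattr` call, which raises `AttributeError` when the attribute was never set, as appears to happen for a sample with no predictions. A minimal sketch of a defensive lookup follows. It is not the maintainers' fix: the `Sample` class here is a hypothetical stand-in and `get_epitopes` is an illustrative helper.

```python
class Sample(object):
    """Hypothetical stand-in for the pipeline's Sample class."""
    pass


def get_epitopes(sample, attr_name):
    """Return the named epitope list, or an empty list if it was never set."""
    # The third argument to getattr is the default returned when the
    # attribute is missing, so no AttributeError is raised.
    return getattr(sample, attr_name, [])
```

With a default like this, a sample that produced no indel epitopes would contribute an empty list to the summary table instead of aborting the run.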

However, it still looks to me that something is off with test2: for me it does produce some neoantigens (info output lines below), while for you it seems there are no mutations in that sample at all. Can you try running the same command after removing test1.vcf (so that it only runs for test2)? If there's still an issue, can you check what the temporary fasta files test2.tmp.*.fasta look like?

Are you sure you don't have Biopython pre-installed with your python distribution? (If you go into the interactive python command line and try "import Bio", does it return an error?) I would expect an error thrown when running NeoPredPipe, because we do import Biopython in vcf_manipulate.py.
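The interactive `import Bio` check described above can also be run as a small script. `has_module` is a hypothetical helper name; `Bio` is the package name that Biopython installs.

```python
import importlib


def has_module(name):
    """Return True if `name` can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


print("Biopython available: %s" % has_module("Bio"))
```

Run it with the same `python` executable that launches NeoPredPipe.py, since a different interpreter may have a different set of installed packages.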


My output for test2:

INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Running Epitope Predictions for test2 on epitopes of length 9
INFO: Running Epitope Predictions for test2 on epitopes of length 8
INFO: Skipping Sample! No peptides to predict for test2
INFO: Running Epitope Predictions for test2 on epitopes of length 10
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
INFO: Digesting neoantigens for test2
INFO: Digesting neoantigens for test2
INFO: Digesting neoantigens for test2
INFO: Object size of neoantigens: 776 Kb

ravichas commented 4 years ago

1) I am sorry. Our Python installation does have the Biopython library.

2) Removed the test1.vcf file (NeoPredPipe/Example/input_vcfs/test1.vcf)

[ravi@cn3607 NeoPredPipe]$ ls -l Example/input_vcfs/
total 128
-rw-rw-r-- 1 ravi ravi 12475 Jun 11 12:06 test2.vcf

3) Ran the following command:

[ravi@cn3607 NeoPredPipe]$ python NeoPredPipe.py -I ./Example/input_vcfs -H ./Example/HLAtypes/hlatypes.txt -o ./ -n TestRun -c 1 2 -E 8 9 10
INFO: Annovar reference files of build hg19 were given, using this build for all analysis.
INFO: Begin.
INFO: Running convert2annovar.py on ./Example/input_vcfs/test2.vcf
INFO: ANNOVAR VCF Conversion Process complete ./Example/input_vcfs/test2.vcf
INFO: Running annotate_variation.pl on ./avready/test2.avinput
INFO: ANNOVAR annotation Process complete for ./avready/test2.avinput
INFO: Running coding_change.pl on ./avannotated/test2.avannotated.exonic_variant_function
INFO: Coding predictions complete for ./avannotated/test2.avannotated.exonic_variant_function
INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
WARNING: Not all samples in HLA file have matching VCF files. Please check that HLA file is tab-separated and sample names match exactly with .vcf file names. Only matching samples will be included in analysis and output tables.
INFO: Summary Tables Complete.
Traceback (most recent call last):
  File "NeoPredPipe.py", line 524, in <module>
    main()
  File "NeoPredPipe.py", line 516, in main
    FinalOut(t, Options, True)
  File "NeoPredPipe.py", line 278, in FinalOut
    appendedEps = getattr(sampleClasses[i],epitopesToProcess)
AttributeError: Sample instance has no attribute 'appendedEpitopesIndels'

4) They are all zero bytes

[ravi@cn3607 NeoPredPipe]$ ls -ltr fastaFiles/*
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.reformat.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.8.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.8.Indels.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.9.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.9.Indels.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.10.fasta
-rw-rw-r-- 1 ravi ravi 0 Jun 17 13:02 fastaFiles/test2.tmp.10.Indels.fasta

Thanks again for your help.


elakatos commented 4 years ago

It seems there are no lines at all in the corresponding fasta file, which could be either because there are no exonic mutations or because ANNOVAR did not run correctly. Is there anything in the test2.avinput and the test2.avannotated* files? There is only a single mutation in test2.avannotated.exonic_variant_function, and I guess that using a refGene table fetched at a different date might cause it not to show up as exonic (see the line below for which mutation). So if the exonic variant file is empty but the others are not, it is a case of differing refGene versions, and your installation is probably fine.

I've just updated test1.vcf, so if you download the newest version, it should have more exonic variants that produce neoantigens.


line56 nonsynonymous SNV ATRIP:NM_001271023:exon7:c.G759T:p.Q253H,ATRIP:NM_130384:exon7:c.G1038T:p.Q346H,ATRIP:NM_032166:exon7:c.G1038T:p.Q346H,ATRIP:NM_001271022:exon8:c.G657T:p.Q219H, chr3 48459899 48459899 C T 0.3333 13.3244 24 chr3 48459899 . C T 13.3244 PASS NS=3;DISTR=|C|CT|CT|;SB=1.0000 GT:A:GQ:SS:BCOUNT:DP 0/0:C:100.0000:0:0,34,0,0:34 0/1:CT:100.0000:2:0,21,0,4:25 0/1:CT:13.3244:2:0,22,0,2:24
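One way to see which step first produces an empty file is to list the size of each intermediate file in pipeline order. The sketch below assumes the folder layout visible in the logs of this thread (avready, avannotated, fastaFiles under the run directory). `stage_report` and `first_empty` are hypothetical helpers, not part of NeoPredPipe.

```python
import glob
import os


def stage_report(sample, base_dir="."):
    """Return [(path, size_in_bytes)] for a sample's intermediate files."""
    # Patterns are listed in the order the pipeline creates the files.
    patterns = [
        os.path.join(base_dir, "avready", sample + ".avinput"),
        os.path.join(base_dir, "avannotated", sample + ".avannotated*"),
        os.path.join(base_dir, "fastaFiles", sample + "*.fasta"),
    ]
    report = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            report.append((path, os.path.getsize(path)))
    return report


def first_empty(report):
    """Return the first zero-byte file in the report, or None."""
    for path, size in report:
        if size == 0:
            return path
    return None
```

If `first_empty(stage_report("test2"))` names the .avinput file, the conversion step is the one to investigate; if only the exonic variant file and the fasta files are empty, the refGene-version explanation above becomes more likely.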

ravichas commented 4 years ago

I recloned the repo.

Here are my loaded modules

Currently Loaded Modules:
  1) python/2.7   2) blast/2.10.0+   3) annovar/2018-04-16

My ANNOVAR setup

[ravi@cn3607 NeoPredPipe]$ echo $ANNOVAR_HOME
/usr/local/apps/annovar/2018-04-16
[ravi@cn3607 NeoPredPipe]$ ls -l $ANNOVAR_HOME
total 528
-rwxr-xr-x 1 hooverdm staff 221481 Apr 16  2018 annotate_variation.pl
-rwxr-xr-x 1 hooverdm staff  27582 Apr 16  2018 coding_change.pl
-rwxr-xr-x 1 hooverdm staff 170158 Apr 16  2018 convert2annovar.pl
drwxr-xr-x 2 hooverdm staff   4096 Sep 16  2019 example
drwxr-xr-x 3 hooverdm staff   4096 Apr 16  2018 humandb
-rwxr-xr-x 1 hooverdm staff  19407 Apr 16  2018 retrieve_seq_from_fasta.pl
-rwxr-xr-x 1 hooverdm staff  39223 Apr 16  2018 table_annovar.pl
-rwxr-xr-x 1 hooverdm staff  21774 Apr 16  2018 variants_reduction.pl

Here is my output

[ravi@cn3607 NeoPredPipe]$  python NeoPredPipe.py -I ./Example/input_vcfs -H ./Example/HLAtypes/hlatypes.txt -o ./ -n TestRun -c 1 2 -E 8 9 10
INFO: Annovar reference files of build hg19 were given, using this build for all analysis.
INFO: Begin.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: Proper directory already exists. Continue.
INFO: ANNOVAR Ready files for test2 already present.
INFO: ANNOVAR Annotation files for test2 already present.
INFO: Coding change fasta files for test2 already present.
INFO: Coding change fasta files test2 has already been reformatted.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 8.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 9.
INFO: Tmp fasta files test2 has already been created for netMHCpan length 10.
INFO: Predicting neoantigens for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Skipping Sample! No peptides to predict for test2
INFO: Predictions complete for test2 on epitopes of length 10.Indels
INFO: ANNOVAR Ready files for test1 already present.
INFO: ANNOVAR Annotation files for test1 already present.
INFO: Coding change fasta files for test1 already present.
INFO: Coding change fasta files test1 has already been reformatted.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 8.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 9.
INFO: Tmp fasta files test1 has already been created for netMHCpan length 10.
INFO: Predicting neoantigens for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Skipping Sample! No peptides to predict for test1
INFO: Predictions complete for test1 on epitopes of length 10.Indels
INFO: Summary Tables Complete.
Traceback (most recent call last):
  File "NeoPredPipe.py", line 524, in <module>
    main()
  File "NeoPredPipe.py", line 516, in main
    FinalOut(t, Options, True)
  File "NeoPredPipe.py", line 278, in FinalOut
    appendedEps = getattr(sampleClasses[i],epitopesToProcess)
AttributeError: Sample instance has no attribute 'appendedEpitopesIndels'

My fastaFiles are still empty.

Thanks very much for all your help.

Ravi

elakatos commented 4 years ago

"ANNOVAR Annotation files for test1 already present." This suggests that even though you've updated test1.vcf, it was not actually re-run through the ANNOVAR analysis, so first try again after deleting the temporary directories. If test1 gets fixed, then the problem was the lack of exonic variants before.

If the problem persists, that would suggest that the issue is not with our pipeline but with ANNOVAR. You can also check whether the .avinput and .avannotated* files have information in them (and what it is), which might highlight which ANNOVAR step has the issue. You should also have a logforannovarNeoPredPipe.txt file that might contain useful messages; you can make sure this file is not deleted even in successful runs by specifying the '-l' option. I would also suggest confirming that ANNOVAR is set up properly by going through their start-up examples (https://doc-openbio.readthedocs.io/projects/annovar/en/latest/user-guide/startup/).
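A quick way to skim the log file mentioned above is to pull out lines that look like problems. The keywords below are an assumption rather than a documented log format, and `flagged_lines` is a hypothetical helper.

```python
def flagged_lines(log_path, keywords=("error", "warning")):
    """Return the log lines that contain any keyword (case-insensitive)."""
    hits = []
    with open(log_path) as handle:
        for line in handle:
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                hits.append(line.rstrip("\n"))
    return hits
```

For example, `flagged_lines("logforannovarNeoPredPipe.txt")` returns the matching lines in file order. An empty result does not prove the run was clean, so reading the full log is still worthwhile.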

ravichas commented 4 years ago

Thank you for your patience. I will check the ANNOVAR pipeline from our end. I will keep you updated.