dib-lab / dammit

just annotate it, dammit!
http://dib-lab.github.io/dammit/

TaskError - IndexError: list index out of range #134

Open johnsolk opened 5 years ago

johnsolk commented 5 years ago

Using the conda-installed version of dammit on the Bridges HPC cluster, with only one custom aa database:

[ljcohen@br018 sbatch_files]$ cat dammit_L_parva-4642625.o
# dammit
## a tool for easy de novo transcriptome annotation

by Camille Scott

**v1.0rc2**, 2018

## submodule: annotate
### Database Check
#### Info
* Database Directory: /pylon5/bi5fpmp/ljcohen/dammit
* Doit Database: /pylon5/bi5fpmp/ljcohen/dammit/databases.doit.db

*All database tasks up-to-date.*

### Annotation
#### Info
* Doit Database: /local/4642625/L_parva/L_parva.dammit/annotate.doit.db
* Input Transcriptome: /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta

Some tasks out of date!

Out-of-date tasks:
* BUSCO-eukaryota
* TransDecoder.LongOrfs
* TransDecoder.Predict
* annotate:fasta
* cmscan:Rfam
* gff3:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa.crbl.csv
* gff3:OrthoDB
* gff3:Pfam-A
* gff3:Rfam
* gff3:merge-all
* hmmscan:Pfam-A
* hmmscan:Pfam-A:remap
* lastal:OrthoDB
* lastal:best-hits:OrthoDB
* rename-transcriptome
* transcriptome-stats
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-fit_and_filter_crbl_hits
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-lastal:.Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa.x.L_parva.trinity_out.Trinity.fasta.pep.maf
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-lastal:.L_parva.trinity_out.Trinity.fasta.pep.x.Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa.maf
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-lastdb:.Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-lastdb:.L_parva.trinity_out.Trinity.fasta.pep
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-rename:/local/4642625/L_parva/Fhet_reference_genome/ensembl/Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-rename:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta
* user-database:Fundulus_heteroclitus.Fundulus_heteroclitus-3.0.2.pep.all.fa-shmlast-translate:.L_parva.trinity_out.Trinity.fasta

#### Run Tasks
- [ ] L_parva.trinity_out.Trinity.fasta: 
    * Python: function get_rename_transcriptome_task.fix
- [ ] transcriptome_stats:L_parva.trinity_out.Trinity.fasta: 
    * Python: function get_transcriptome_stats_task.cmd
- [ ] busco:L_parva.trinity_out.Trinity.fasta-eukaryota_odb9: 
    * Cmd: `python3 /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/run_BUSCO.py -i /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta -f -o L_parva.trinity_out.Trinity.fasta.eukaryota.busco.results -l /pylon5/bi5fpmp/ljcohen/dammit/busco2db/eukaryota_odb9 -m tran -c 14`
- [ ] TransDecoder.LongOrfs:L_parva.trinity_out.Trinity.fasta: 
    * Cmd: `/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/TransDecoder.LongOrfs -t /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta -m 80`
- [ ] hmmscan:longest_orfs.pep.x.Pfam-A.hmm: 
    * Cmd: `cat /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.transdecoder_dir/longest_orfs.pep | /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/parallel --round-robin --pipe -L 2 -N 10000 --gnu -j 14 -a /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.transdecoder_dir/longest_orfs.pep /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/hmmscan --cpu 1 --domtblout /dev/stdout -E 1e-05 -o /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.transdecoder_dir/longest_orfs.pep.x.pfam.tbl.hmmscan.out /pylon5/bi5fpmp/ljcohen/dammit/Pfam-A.hmm /dev/stdin > /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.transdecoder_dir/longest_orfs.pep.x.pfam.tbl`
- [ ] remap_hmmer:longest_orfs.pep.x.pfam.tbl: 
    * Python: function get_remap_hmmer_task.cmd
- [ ] hmmscan-gff3:L_parva.trinity_out.Trinity.fasta.x.pfam-A.gff3: 
    * Cmd: `rm -f /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.pfam-A.gff3`
    * Python: function get_hmmscan_gff3_task.cmd
- [ ] TransDecoder.Predict:L_parva.trinity_out.Trinity.fasta: 
    * Cmd: `/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/TransDecoder.Predict -t /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta --retain_pfam_hits /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.transdecoder_dir/longest_orfs.pep.x.pfam.tbl`
- [ ] cmscan:L_parva.trinity_out.Trinity.fasta.x.Rfam.cm: 
    * Cmd: `cat /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta | /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/parallel --round-robin --pipe -L 2 -N 10000 --gnu -j 14 -a /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/cmscan --cpu 1 --rfam --nohmmonly -E 1e-05 --tblout /dev/stdout -o /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.rfam.tbl.cmscan.out /pylon5/bi5fpmp/ljcohen/dammit/Rfam.cm /dev/stdin > /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.rfam.tbl`
- [ ] cmscan-gff3:L_parva.trinity_out.Trinity.fasta.x.rfam.gff3: 
    * Cmd: `rm -f /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.rfam.gff3`
    * Python: function get_cmscan_gff3_task.cmd
- [ ] lastal:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf: 
    * Python: function setup_profiler.add_profile_actions.start_profiling
    * Cmd: `cat /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta | /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/parallel --block `expr $(wc -c /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta | awk '{print $1}') / 14` --round-robin --pipe --recstart '>' --gnu -j 14 /pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/bin/lastal -F15 -D100000.0 /pylon5/bi5fpmp/ljcohen/dammit/aa_seq_euk.fasta > /local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf`
    * Python: function setup_profiler.add_profile_actions.stop_profiling
- [ ] maf_best_hits:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf-/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.best.csv: 
    * Python: function get_maf_best_hits_task.cmd
TaskError - taskid:maf_best_hits:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf-/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.best.csv
PythonAction Error
Traceback (most recent call last):
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/doit/action.py", line 424, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/dammit/tasks/gff.py", line 39, in cmd
    df = MafParser(maf_fn).read()
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/dammit/fileio/base.py", line 79, in read
    return pd.concat(self, ignore_index=True)
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 225, in concat
    copy=copy, sort=sort)
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 256, in __init__
    objs = list(objs)
  File "/pylon5/bi5fpmp/ljcohen/miniconda3/envs/dammit_conda/lib/python3.6/site-packages/dammit/fileio/maf.py", line 80, in __iter__
    cur_aln['s_start'] = int(tokens[2])
IndexError: list index out of range

########################################
TaskError - taskid:maf_best_hits:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf-/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.best.csv
maf_best_hits:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf-/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.best.csv <stderr>:

maf_best_hits:/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf-/local/4642625/L_parva/L_parva.dammit/L_parva.trinity_out.Trinity.fasta.x.OrthoDB.best.csv <stdout>:

johnsolk commented 5 years ago

Any idea what this means? The conda version used to work just fine!

The problem is that this was run on a different node, and the directory was then copied over. If I try to run it again (x 16 different species x 3 different aa databases), the pipeline starts from the beginning. :(
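
The traceback above stops at `int(tokens[2])` in `dammit/fileio/maf.py`, which suggests the parser met a line with fewer whitespace-separated fields than it expected. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that `L_parva.trinity_out.Trinity.fasta.x.OrthoDB.maf` contains a truncated or malformed sequence line. In the MAF format, an `s` line normally carries seven fields (`s`, name, start, size, strand, source size, alignment text). A minimal sketch for locating short `s` lines follows; the function name `find_short_maf_lines` is hypothetical and not part of dammit.

```python
import sys


def find_short_maf_lines(lines, min_fields=7):
    """Return (line_number, text) for MAF 's' lines with too few fields.

    A complete 's' line has 7 whitespace-separated fields:
    s, name, start, size, strand, source size, alignment text.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        tokens = raw.split()
        # Skip blank lines and non-sequence lines (e.g. 'a' or '#' lines).
        if tokens and tokens[0] == "s" and len(tokens) < min_fields:
            problems.append((lineno, raw.rstrip("\n")))
    return problems


# Usage (hypothetical script name): python check_maf.py path/to/file.maf
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for lineno, text in find_short_maf_lines(handle):
            print(f"line {lineno}: {text!r}")
```

If this reports any lines, the `.maf` file is likely incomplete and the `lastal` step that produced it may need to be rerun. If it reports nothing, the cause lies elsewhere and this check does not explain the error.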