dib-lab / dammit

just annotate it, dammit!
http://dib-lab.github.io/dammit/

Some tasks failed![dammit.annotate:ERROR] - hmmscan:longest_orfs.pep.x.Pfam-A.hmm #64

Closed saabalde closed 7 years ago

saabalde commented 8 years ago

Hi,

I'm having some kind of problem running dammit. I've checked all the dependencies and databases, and everything seems to be all right. I'm running the tutorial, just to check that everything works, and I get this output:

========================================
dammit! a tool for easy de novo transcriptome annotation
Camille Scott 2015
========================================

submodule: annotate


--- Checking PATH for dependencies

          [x] HMMER

          [x] Infernal

          [x] BLAST+

          [x] BUSCO

          [x] TransDecoder

          [x] LAST

          [x] crb-blast

--- Dependency results

          All dependencies satisfied!

--- Checking for database prep (dir: /gpfs/csic_users/saabalde/scratch/anaconda2_db)

          [x] download_and_gunzip:Pfam-A.hmm

          [x] hmmpress:Pfam-A.hmm

          [x] download_and_gunzip:Rfam.cm

          [x] cmpress:Rfam.cm

          [x] download_and_gunzip:aa_seq_euk.fasta

          [x] lastdb:aa_seq_euk.fasta.db

          [x] download_and_gunzip:ODB8_EukOGs_genes_ALL_levels.txt

          [x] download_and_untar:anaconda2_db-eukaryota

--- Database results

          All databases prepared!

--- Running annotate!

          Transcriptome file: cdna_nointrons_utrs.fa

          Output directory: /gpfs/res_scratch/cvcv/saabalde/dammit_test/front-
          end/cdna_nointrons_utrs.fa.dammit

          [x] cdna_nointrons_utrs.fa

          [x] transcriptome_stats:cdna_nointrons_utrs.fa

          [x] busco:cdna_nointrons_utrs.fa-eukaryota

          [x] TransDecoder.LongOrfs:cdna_nointrons_utrs.fa

          [ ] hmmscan:longest_orfs.pep.x.Pfam-A.hmm

Some tasks failed![dammit.annotate:ERROR]
TaskFailed - taskid:hmmscan:longest_orfs.pep.x.Pfam-A.hmm[dammit.annotate:ERROR]
Command failed: 'S=`stat -c "%s" cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep`; B=`expr $S / 1`; cat cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep | /gpfs/csic_users/saabalde/local/bib/active/bin/parallel --block $B --pipe --recstart ">" --gnu -j 1 /gpfs/csic_users/saabalde/hmmer-3.1b2-linux-intel-x86_64/binaries/hmmscan --cpu 1 --domtblout /dev/stdout -E 1e-05 -o cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep.pfam.tbl.out /gpfs/csic_users/saabalde/scratch/anaconda2_db/Pfam-A.hmm /dev/stdin > cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep.pfam.tbl' returned 1
[dammit.annotate:ERROR]

I've tried running with the --debug option, but it doesn't return any additional information. Is there any way to get more detail about the error, such as a log or report file?

By the way, I've installed dammit in an Anaconda2 environment. I assume this is not a problem, since using Anaconda is recommended in the installation guide.

All help is welcome. I don't know what else I could try. Many thanks, Samu
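One way to get more detail than the task summary gives is to re-run the failed command by hand and read its standard error. The following is only a minimal sketch, assuming Python 3.7 or newer; `run_and_report` is a hypothetical helper and not part of dammit, and the placeholder command in the example stands in for the hmmscan invocation quoted in the error message (run directly, without the `parallel` wrapper).

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command (given as a list of arguments, no shell) and
    return its exit code, standard output, and standard error."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Placeholder command that fails on purpose; replace it with the
    # hmmscan command line taken from the "Command failed" message.
    placeholder = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('example failure\\n'); sys.exit(1)",
    ]
    code, out, err = run_and_report(placeholder)
    print("exit code:", code)
    print("stderr:", err.strip())
```

Whatever the tool writes to standard error (for example, a parse error on the input file) is then visible instead of just `returned 1`.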

saabalde commented 8 years ago

[Update] If I edit the input file (remember: the one from the tutorial), I don't get stuck at that point. If I run just the first sequence or the first 100 sequences (removing the extra line between sequences), I get:

========================================
dammit! a tool for easy de novo transcriptome annotation
Camille Scott 2015
========================================

submodule: annotate


--- Dependency results

          All dependencies satisfied!

--- Database results

          All databases prepared!

--- Running annotate!

          Transcriptome file: cdna_nointrons_utrs.100.fa

          Output directory: /gpfs/res_scratch/cvcv/saabalde/dammit_test/front-
          end/cdna_nointrons_utrs.100.fa.dammit

          [x] cdna_nointrons_utrs.100.fa

          [x] transcriptome_stats:cdna_nointrons_utrs.100.fa

          [x] busco:cdna_nointrons_utrs.100.fa-eukaryota

          [x] TransDecoder.LongOrfs:cdna_nointrons_utrs.100.fa

          [x] hmmscan:longest_orfs.pep.x.Pfam-A.hmm

          [x] remap_hmmer:longest_orfs.pep.pfam.tbl

          [x] TransDecoder.Predict:cdna_nointrons_utrs.100.fa

          [ ] cmscan:cdna_nointrons_utrs.100.fa.x.Rfam.cm

          [x] lastal:cdna_nointrons_utrs.100.fa.x.orthodb.maf

          [x] sanitize_fasta:pep.fa

          [x] crb-blast:cdna_nointrons_utrs.100.fa.x.pep.fa

          [x] maf_best_hits:cdna_nointrons_utrs.100.fa.x.orthodb.maf-
          cdna_nointrons_utrs.100.fa.x.orthodb.maf.best.csv

          [x] maf-gff3:cdna_nointrons_utrs.100.fa.x.orthodb.maf.gff3

          [x] hmmscan-gff3:cdna_nointrons_utrs.100.fa.pfam.csv.gff3

          [ ] cmscan-gff3:cdna_nointrons_utrs.100.fa.rfam.tbl.gff3

          [x] crbb-gff3:cdna_nointrons_utrs.100.fa.x.pep.fa.crbb.tsv.gff3

          [ ] gff3-merge:cdna_nointrons_utrs.100.fa.dammit.gff3

          [ ] fasta-annotate:cdna_nointrons_utrs.100.fa.dammit.fasta

But this is another problem. If I run the whole input file (with the 5000 sequences), again without the extra line between sequences, I still get stuck at the same point:

--- Running annotate!

          Transcriptome file: cdna_nointrons_utrs.fa

          Output directory: /gpfs/res_scratch/cvcv/saabalde/dammit_test/front-
          end/cdna_nointrons_utrs.fa.dammit

          [ ] cdna_nointrons_utrs.fa

          [ ] transcriptome_stats:cdna_nointrons_utrs.fa

          [ ] busco:cdna_nointrons_utrs.fa-eukaryota

          [ ] TransDecoder.LongOrfs:cdna_nointrons_utrs.fa

          [ ] hmmscan:longest_orfs.pep.x.Pfam-A.hmm

Some tasks failed![dammit.annotate:ERROR]
TaskFailed - taskid:hmmscan:longest_orfs.pep.x.Pfam-A.hmm[dammit.annotate:ERROR]
Command failed: 'S=`stat -c "%s" cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep`; B=`expr $S / 1`; cat cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep | /gpfs/csic_users/saabalde/local/bib/active/bin/parallel --block $B --pipe --recstart ">" --gnu -j 1 /gpfs/csic_users/saabalde/hmmer-3.1b2-linux-intel-x86_64/binaries/hmmscan --cpu 1 --domtblout /dev/stdout -E 1e-05 -o cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep.pfam.tbl.out /gpfs/csic_users/saabalde/scratch/anaconda2_db/Pfam-A.hmm /dev/stdin > cdna_nointrons_utrs.fa.transdecoder_dir/longest_orfs.pep.pfam.tbl' returned 1
[dammit.annotate:ERROR]

I don't know what else I can do.

saabalde commented 8 years ago

[UPDATE]

dammit! is working now. The underlying problem was the extra line between sequences. I was getting the same error when running the whole file (instead of the 100-sequence one) because of the runtime limit on the cluster. After running it on the queue, the process finishes correctly.

--- Running annotate!

          Transcriptome file: cdna_nointrons_utrs.fa

          Output directory: /gpfs/res_scratch/cvcv/saabalde/dammit_test/queue/cdna_nointro
          ns_utrs.fa.dammit

          [x] cdna_nointrons_utrs.fa

          [x] transcriptome_stats:cdna_nointrons_utrs.fa

          [x] busco:cdna_nointrons_utrs.fa-eukaryota

          [x] TransDecoder.LongOrfs:cdna_nointrons_utrs.fa

          [x] hmmscan:longest_orfs.pep.x.Pfam-A.hmm

          [x] remap_hmmer:longest_orfs.pep.pfam.tbl

          [x] TransDecoder.Predict:cdna_nointrons_utrs.fa

          [ ] cmscan:cdna_nointrons_utrs.fa.x.Rfam.cm

          [x] lastal:cdna_nointrons_utrs.fa.x.orthodb.maf

          [x] sanitize_fasta:pep.fa

          [x] crb-blast:cdna_nointrons_utrs.fa.x.pep.fa

          [x] maf_best_hits:cdna_nointrons_utrs.fa.x.orthodb.maf-
          cdna_nointrons_utrs.fa.x.orthodb.maf.best.csv

          [x] maf-gff3:cdna_nointrons_utrs.fa.x.orthodb.maf.gff3

          [x] hmmscan-gff3:cdna_nointrons_utrs.fa.pfam.csv.gff3

          [ ] cmscan-gff3:cdna_nointrons_utrs.fa.rfam.tbl.gff3

          [x] crbb-gff3:cdna_nointrons_utrs.fa.x.pep.fa.crbb.tsv.gff3

          [x] gff3-merge:cdna_nointrons_utrs.fa.dammit.gff3

          [x] fasta-annotate:cdna_nointrons_utrs.fa.dammit.fasta

The only problem I have now is cmscan. The pipeline skips that step, but it doesn't fail. That is a separate problem, anyway...
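For anyone hitting the same hmmscan failure, the blank-line cleanup described in this comment can be sketched as follows. This is a minimal illustration, not what dammit does internally; `strip_blank_lines` and `clean_fasta` are hypothetical helper names, and the file names in the usage comment are only examples.

```python
def strip_blank_lines(lines):
    """Yield FASTA lines, skipping any line that is empty or
    contains only whitespace. Each yielded line ends in a newline."""
    for line in lines:
        if line.strip():
            yield line.rstrip("\n") + "\n"


def clean_fasta(src_path, dst_path):
    """Copy a FASTA file to dst_path, dropping blank lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_blank_lines(src))


# Example usage (file names are illustrative):
# clean_fasta("cdna_nointrons_utrs.fa", "cdna_nointrons_utrs.clean.fa")
```

Writing to a new file instead of editing in place keeps the original tutorial input available for comparison.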

camillescott commented 8 years ago

Sorry for the lack of response :) I believe the cmscan issue is related to a bug in the cmscan task's targets, which was fixed in the latest minor version update. An upcoming version is also going to do much more aggressive sanitization of the input file to remove these sorts of errors, so that should help as well.

saabalde commented 8 years ago

Yes, I saw that commit in the "code" section. I tried to update dammit after that, but it didn't work, so I'm going to wait until the next version. Thank you very much for your reply, and don't worry about the delay. I could use dammit! anyway.

camillescott commented 7 years ago

(fixed in 1.0 beta; closing)