nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

GeneMark-ES failed: fun/predict_misc/genemark/output/gmhmm.mod file missing #172

Closed KnudNorNielsen closed 6 years ago

KnudNorNielsen commented 6 years ago

Hello,

Running funannotate 1.3.2 (for the first time) I got the following error in the predict step.

funannotate predict -i Funannotate/UCPHDK1.ann.1805025.fa -o fun --species "fusarium graminearum"

[03:17 PM]: Running Diamond pre-filter search
[03:33 PM]: Found 328,356 preliminary alignments
[09:26 PM]: Exonerate finished: found 1,232 alignments
[09:27 PM]: Running GeneMark-ES on assembly
[09:28 PM]: GeneMark-ES failed: fun/predict_misc/genemark/output/gmhmm.mod file missing , please check logfiles.

The genemark/output folder is empty!

Checking the log

[knunie@computerome01 genemark]$ cat gmes.log
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:35 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild --reformat_fasta --uppercase --allow_x --letters_per_line 60 --out data/dna.fna --label _dna --trace info/dna.trace --in /home/projects/mg_guests/people/knunie/fun/predict_misc/genome.softmasked.fa  --mask_soft 5000
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:52 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild  --seq data/dna.fna  --allow_x  --stat info/dna.general
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:53 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild  --seq data/dna.fna  --allow_x  --stat_fasta info/dna.multi_fasta
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:54 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild  --seq data/dna.fna  --allow_x  --substring_n_distr info/dna.gap_distr
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:56 2018] /services/tools/genemark-es/4.33/gmes_petap/gc_distr.pl --in data/dna.fna  --out info/dna.gc.csv  --w 1000,8000
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:27:58 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild  --seq /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/data/dna.fna  --split dna.fa  --max_contig 5000000 --min_contig 50000 --letters_per_line 100 --split_at_n 5000 --split_at_x 5000 --allow_x --x_to_n  --trace ../../info/training.trace
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:00 2018] /services/tools/genemark-es/4.33/gmes_petap/probuild --seq data/training.fna --stat info/training.general --allow_x  --GC_PRECISION 0
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:01 2018] /services/tools/genemark-es/4.33/gmes_petap/build_mod.pl --cfg /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/run.cfg  --section ES_ini --def /services/tools/genemark-es/4.33/gmes_petap/heu_dir/heu_05_gcode_1_gc_53.mod
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:01 2018] ln -sf  /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/run/ES_ini/es_ini.mod  run/ini.mod
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:01 2018] 16 contigs in training
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:09 2018] ln -sf /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/run/ES_ini/es_ini.mod  /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/run/ES_A_1/prev.mod
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:09 2018] /services/tools/genemark-es/4.33/gmes_petap/parse_set.pl --section ES_A --cfg  /home/projects/mg_guests/people/knunie/fun/predict_misc/genemark/run.cfg  --v
/services/tools/genemark-es/4.33/gmes_petap/gmes_petap.pl : [Fri May 25 21:28:09 2018] error

I have genemark-es/4.33 loaded.

What could cause this error?

Best regards,

Knud Nor Nielsen

hyphaltip commented 6 years ago

I find highly fragmented assemblies can cause self-training to fail. I often build a filtered assembly containing only contigs above 10 kb and train on that (assuming there are enough of them). Then just copy the trained model file to your annotation folder and pass it to funannotate with the command-line argument for the GeneMark model. At times I also pre-run GeneMark on the masked genome and give the GTF file to funannotate with the corresponding command-line argument.

Jason

Jason E Stajich, PhD Professor and Director, Microbiology Graduate Program Department of Microbiology and Plant Pathology University of California, Riverside http://lab.stajich.org @stajichlab @hyphaltip @zygolife +1 951.827.2363
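
A minimal sketch of the length filter suggested above, assuming a plain multi-FASTA assembly and Python 3. This is a standalone helper, not part of funannotate; the function names and the `min_len` default are illustrative, and the threshold simply mirrors the "above 10kb" rule of thumb.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one first.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(in_path, out_path, min_len=10_000, width=60):
    """Write only records of at least min_len bases; return how many were kept."""
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            if len(seq) >= min_len:
                out.write(f">{header}\n")
                # Wrap the sequence at a fixed line width.
                for i in range(0, len(seq), width):
                    out.write(seq[i:i + width] + "\n")
                kept += 1
    return kept
```

Something like `filter_by_length("assembly.fa", "assembly.min10kb.fa")` (hypothetical file names) would write the filtered assembly, on which GeneMark-ES could then be trained separately.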


KnudNorNielsen commented 6 years ago

Thank you for your response.

It's a fungal genome of 38 Mb that has been sequenced with PacBio and assembled using CANU. The 25 contigs from CANU have been scaffolded into 19 scaffolds using SSPACE-LongRead. So I don't think that the assembly is too fragmented :-)

At first I overlooked funannotate-predict.log. It appears that it's a licence issue.

[knunie@computerome01 logfiles]$ cat funannotate-predict.log
[05/25/18 11:14:25]: /services/tools/funannotate/1.3.2/bin/funannotate-predict.py -i Funannotate/UCPHDK1.ann.1805025.fa -o fun --species fusarium graminearum
[05/25/18 11:14:25]: OS: linux2, 28 cores, ~ 132 GB RAM. Python: 2.7.14
[05/25/18 11:14:25]: Running funannotate v1.3.2
[05/25/18 11:14:29]: Augustus training set for fusarium_graminearum already exists. To re-train provide unique --augustus_species argument
[05/25/18 11:14:31]: AUGUSTUS (3.3) detected, version seems to be compatible with BRAKER and BUSCO
[05/25/18 11:14:34]: Loading sequences and soft-masking genome
[05/25/18 11:14:34]: Soft-masking: building RepeatModeler database
[05/25/18 11:14:42]: Soft-masking: generating repeat library using RepeatModeler
[05/25/18 15:00:40]: Soft-masking: running RepeatMasker with custom library
[05/25/18 15:17:22]: rmOutToGFF3.pl genome.fasta.out
[05/25/18 15:17:26]: Masked genome: 19 scaffolds; 37,431,244 bp; 2.18% repeats masked
[05/25/18 15:17:41]: Mapping proteins to genome using Diamond blastx/Exonerate
[05/25/18 21:27:31]: /services/tools/augustus/3.3/scripts/exonerate2hints.pl --in=fun/predict_misc/exonerate.out --out=fun/predict_misc/hints.P.gff --minintronlen=10 --maxintronlen=3000
[05/25/18 21:27:32]: perl /services/tools/augustus/3.3/scripts/join_mult_hints.pl
[05/25/18 21:27:32]: Running GeneMark-ES on assembly
[05/25/18 21:27:32]: gmes_petap.pl --ES --max_intron 3000 --soft_mask 5000 --cores 2 --sequence /home/projects/mg_guests/people/knunie/fun/predict_misc/genome.softmasked.fa --fungus
[05/25/18 21:28:09]: (None, 'GeneMark.hmm 400-day license.\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. 
at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\nGeneMark.hmm 400-day license.\nYour 400-day license period has ended. We hope that you found this\nsoftware useful. If you would like to renew this license,\nplease contact GeneProbe, Inc. at custserv@genepro.com\n\n')
[05/25/18 21:28:09]: GeneMark-ES failed: fun/predict_misc/genemark/output/gmhmm.mod file missing , please check logfiles.

Knud

Knud Nor Nielsen Ph.D Fellow Forest Genetics and Diversity Department of Geoscience and Natural Resource Management University of Copenhagen knn@ign.ku.dk
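
The check that was missed at first can be sketched as a scan of the log for the expiry message. This is a minimal sketch assuming Python 3; the function name is hypothetical, and the pattern is taken from the wording in the log above ("Your 400-day license period has ended"), which may differ in other GeneMark versions.

```python
import re

# Phrase copied from the log output above; other versions may word it differently.
LICENSE_EXPIRED = re.compile(r"license period has ended", re.IGNORECASE)


def find_license_errors(log_path):
    """Return the 1-based line numbers of log lines that report an expired license."""
    hits = []
    with open(log_path, errors="replace") as handle:
        for lineno, line in enumerate(handle, start=1):
            if LICENSE_EXPIRED.search(line):
                hits.append(lineno)
    return hits
```

Running it on `fun/logfiles/funannotate-predict.log` (path assumed from the prompt shown above) should return a non-empty list when the key has expired.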

iwangtoknow commented 6 years ago

I think you may have found the problem. Look at the second-to-last line of the log in the comment above: "GeneMark.hmm 400-day license." My advice is to sort that out first. BTW: update to funannotate 1.3.3.

nextgenusfs commented 6 years ago

For those finding similar errors -- to "reactivate" your GeneMark license you have to re-download the gm_key_64 file and move it into the install location.

KnudNorNielsen commented 6 years ago

Funny you point to that - it is exactly what we have done, and it worked. Sorry to have troubled you with this, but my support first claimed that we had a brand-new licence; apparently we didn't. Thank you.

nextgenusfs commented 6 years ago

Great, glad you got it fixed. Well the "license" isn't very obvious with GeneMark and as far as I know it isn't possible to tell how many days you have left -- not really sure why it exists at all....

zrlewis commented 5 years ago

I would add: and make sure to rename the key to .gm_key

jolespin commented 5 years ago

I'm getting this error as well. I have the new gm_key_64 copied to ~/.gm_key and into the same directory as genemark. Is there a way to specify where the key is?

nextgenusfs commented 5 years ago

Not sure how it works on a cluster. Perhaps the home directory in your local environment is different from the one for your user account? I don't actually know exactly how it works; that would be a question for the GeneMark developers.
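
One way to test the home-directory guess is to print what a job on the cluster actually sees. This is a minimal sketch for Unix systems under Python 3; the function name and dictionary keys are hypothetical, and whether GeneMark consults `$HOME` or the passwd entry is not something this thread confirms.

```python
import os
import pwd


def home_dirs():
    """Collect the home directory as seen through different lookups."""
    try:
        passwd_home = pwd.getpwuid(os.getuid()).pw_dir
    except KeyError:
        # The current uid has no passwd entry (can happen in containers).
        passwd_home = None
    return {
        "env_HOME": os.environ.get("HOME"),
        "expanduser": os.path.expanduser("~"),
        "passwd_home": passwd_home,
    }


if __name__ == "__main__":
    for name, value in home_dirs().items():
        print(f"{name}: {value}")
```

If the values differ between the login node and a submitted job, the `.gm_key` file may be sitting in a directory the job never looks at.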

zrlewis commented 5 years ago

@jolespin From my notes, I did this.

Saved as .gm_key in home, and gm_key_64 in the installation directory 

Not sure which copy did the trick. I'm running it on a cluster
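
The two copies described above can be sketched as follows, assuming Python 3 and that the freshly downloaded key is already an unpacked file. The function and parameter names are hypothetical, and, as noted, it is not clear from this thread which of the two locations GeneMark actually reads.

```python
import os
import shutil


def install_gm_key(key_path, home=None, install_dir=None):
    """Copy the key to <home>/.gm_key and, optionally, <install_dir>/gm_key_64."""
    home = home or os.path.expanduser("~")
    targets = [os.path.join(home, ".gm_key")]
    if install_dir:
        targets.append(os.path.join(install_dir, "gm_key_64"))
    for target in targets:
        # copyfile overwrites an existing (expired) key at the target path.
        shutil.copyfile(key_path, target)
    return targets
```

For example, `install_gm_key("gm_key_64", install_dir="/services/tools/genemark-es/4.33/gmes_petap")` would cover both places on the system from the original report, given write permission to the install directory.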

AdamVS commented 10 months ago

For GeneMark-ES v4.72, saving the key as .gm_key in the home directory works.