nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Evidence modeler has failed, exiting #1042

Open genbuf opened 1 month ago

genbuf commented 1 month ago

Hi Jon, I am running funannotate in a recently created conda environment. When I ran the test command "funannotate test -t all --cpus 5", it reported an error. What should I do to make funannotate run correctly? I hope you can help me.

Thanks! Here is the error message from the test command:

#########################################################
Running funannotate clean unit testing: minimap2 mediated assembly duplications
Downloading: https://osf.io/8pjbe/download?version=1 Bytes: 252076
6 input contigs, 6 larger than 500 bp, N50 is 427,039 bp
Checking duplication of 6 contigs
minimap2 version=2.28-r1209 path=/public/software/anaconda3/envs/funannotate/bin/minimap2
scaffold_73 appears duplicated: 100% identity over 100% of the contig. contig length: 15153
scaffold_91 appears duplicated: 100% identity over 100% of the contig. contig length: 8858
scaffold_27 appears duplicated: 100% identity over 100% of the contig. contig length: 427039
6 input contigs; 6 larger than 500 bp; 3 duplicated; 3 written to file
CMD: funannotate clean -i test.clean.fa -o test.exhaustive.fa --exhaustive
#########################################################
#########################################################
SUCCESS: funannotate clean test complete.
#########################################################

#########################################################
Running funannotate mask unit testing: RepeatModeler --> RepeatMasker
Downloading: https://osf.io/hbryz/download?version=1 Bytes: 375687
[May 16 10:53 AM]: OS: CentOS Linux 7, 20 cores, ~ 131 GB RAM. Python: 3.8.19
[May 16 10:53 AM]: Running funanotate v1.8.17
[May 16 10:53 AM]: Soft-masking simple repeats with tantan
[May 16 10:53 AM]: Repeat soft-masking finished:
    Masked genome: /public/home/user/dir/fungal_protocol/download_new/funannotate_test/test-mask_81613ea8-b05d-4f7b-9be5-66c4b0ef3946/test.masked.fa
    num scaffolds: 2
    assembly size: 1,216,048 bp
    masked repeats: 50,965 bp (4.19%)
CMD: funannotate mask -i test.fa -o test.masked.fa --cpus 5
#########################################################
#########################################################
SUCCESS: funannotate mask test complete.
#########################################################

#########################################################
Running funannotate predict unit testing
Downloading: https://osf.io/te2pf/download?version=1
[May 16 10:53 AM]: OS: CentOS Linux 7, 20 cores, ~ 131 GB RAM. Python: 3.8.19
[May 16 10:53 AM]: Running funannotate v1.8.17
[May 16 10:53 AM]: Skipping CodingQuarry as no --rna_bam passed
[May 16 10:53 AM]: Parsed training data, run ab-initio gene predictors as follows:
    Program      Training-Method
    augustus     pretrained
    genemark     selftraining
    glimmerhmm   busco
    snap         busco
[May 16 10:53 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[May 16 10:53 AM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate-1.8.17-py3.8.egg/funannotate/aux_scripts/funannotate-p2g.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import parse_version
[May 16 10:53 AM]: Mapping 1,065 proteins to genome using diamond and exonerate
[May 16 10:53 AM]: Found 1,505 preliminary alignments with diamond in 0:00:03 --> generated FASTA files for exonerate in 0:00:00
[May 16 10:54 AM]: Exonerate finished in 0:00:34: found 1,270 alignments
Progress: 1505 complete, 0 failed, 0 remaining
[May 16 10:54 AM]: Running GeneMark-ES on assembly
[May 16 10:57 AM]: 1,556 predictions from GeneMark
[May 16 10:57 AM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[May 16 11:07 AM]: 370 valid BUSCO predictions found, validating protein sequences
[May 16 11:09 AM]: 207 BUSCO predictions validated
[May 16 11:09 AM]: Running Augustus gene prediction using saccharomyces parameters
[May 16 11:11 AM]: 1,485 predictions from Augustus
Progress: 11 complete, 0 failed, 0 remaining
[May 16 11:11 AM]: Pulling out high quality Augustus predictions
[May 16 11:11 AM]: Found 371 high quality predictions from Augustus (>90% exon evidence)
[May 16 11:11 AM]: Running SNAP gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[May 16 11:12 AM]: 0 predictions from SNAP
[May 16 11:12 AM]: SNAP prediction failed, moving on without result
[May 16 11:12 AM]: Running GlimmerHMM gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[May 16 11:14 AM]: 413 predictions from GlimmerHMM
[May 16 11:14 AM]: Summary of gene models passed to EVM (weights):
[May 16 11:14 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
[May 16 11:14 AM]: Converting to GFF3 and collecting all EVM results
Progress: 0 complete, 0 failed, 0 remaining
    Source         Weight   Count
    Augustus       1        1325
    Augustus HiQ   2        372
    GeneMark       1        1556
    GlimmerHMM     1        413
    Total          -        3666
[May 16 11:14 AM]: Evidence modeler has failed, exiting
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 5 --species Awesome testicus
#########################################################
#########################################################
Traceback (most recent call last):
  File "/public/software/anaconda3/envs/funannotate/bin/funannotate", line 33, in <module>
    sys.exit(load_entry_point('funannotate==1.8.17', 'console_scripts', 'funannotate')())
  File "/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate-1.8.17-py3.8.egg/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate-1.8.17-py3.8.egg/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate-1.8.17-py3.8.egg/funannotate/test.py", line 160, in runPredictTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate-1.8.17-py3.8.egg/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-predict_81613ea8-b05d-4f7b-9be5-66c4b0ef3946/annotate/predict_results/Awesome_testicus.gff3'
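
The FileNotFoundError at the end of the traceback appears to be a downstream symptom: the log reports "Evidence modeler has failed, exiting" before any GFF3 is written, and the test then tries to count genes in a file that was never created. The sketch below illustrates that kind of check with an explicit guard for the missing file. The helper name `count_gff_genes` and its column logic are assumptions made for illustration, not funannotate's actual `countGFFgenes` implementation.

```python
import os


def count_gff_genes(path):
    """Count GFF3 rows whose feature type (column 3) is 'gene'.

    Returns None when the file does not exist, so the caller can tell
    "no output was produced" apart from "output contains zero genes".
    Hypothetical helper for illustration only.
    """
    if not os.path.isfile(path):
        return None
    count = 0
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip GFF3 directives and comments
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 3 and cols[2] == "gene":
                count += 1
    return count


if __name__ == "__main__":
    # Path copied from the traceback above, relative to the test directory.
    gff = os.path.join(
        "test-predict_81613ea8-b05d-4f7b-9be5-66c4b0ef3946",
        "annotate", "predict_results", "Awesome_testicus.gff3",
    )
    n = count_gff_genes(gff)
    if n is None:
        print("GFF3 not found: predict stopped before writing results")
    else:
        print("gene models:", n)
```

A `None` result here points back to the earlier Evidence Modeler step as the place to investigate, rather than to the test script itself.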

genbuf commented 1 month ago

Here is the installation information:


Checking dependencies for 1.8.17

You are running Python v 3.8.19. Now checking python packages...
biopython: 1.76
goatools: 1.3.11
matplotlib: 3.7.5
natsort: 8.4.0
numpy: 1.22.3
pandas: 2.0.3
psutil: 5.9.8
requests: 2.31.0
scikit-learn: 1.3.2
scipy: 1.10.1
seaborn: 0.13.2
All 11 python packages installed

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.302
JSON: 4.02
LWP::UserAgent: 6.77
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 1.17
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.28
local::lib: 2.000024
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/public/database/funannotate_db
$PASAHOME=/public/software/anaconda3/envs/funannotate/opt/pasa-2.3.3
$TRINITY_HOME=/public/software/anaconda3/envs/funannotate/opt/trinity-2.1.1/
$EVM_HOME=/public/software/anaconda3/envs/funannotate/opt/evidencemodeler-2.1.0/
$AUGUSTUS_CONFIG_PATH=/public/software/augustus-3.4.0/config
$GENEMARK_PATH=/public/software/gmes_linux_64_4/
All 6 environmental variables are set

Checking external dependencies...
samtools: /public/software/anaconda3/envs/funannotate/bin/../lib/libtinfow.so.6: no version information available (required by samtools)
samtools: /public/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
samtools: /public/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
PASA: 2.3.3
CodingQuarry: 2.0
Trinity: Trinity version: v2.1.1
augustus: 3.4.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.26.0
blat: BLAT v35
diamond: 2.1.8
emapper.py: There was an error retrieving eggnog-mapper DB data: not a valid file "/public/software/anaconda3/envs/funannotate/lib/python3.8/site-packages/data/eggnog.db" Maybe you need to run download_eggnog_data.py
emapper-2.1.12 / Expected eggNOG DB version: 5.0.2 / Installed eggNOG DB version: unknown / Diamond found: diamond 2.1.8 / MMseqs2 found: 13.45111 / Compatible novel families DB version: 1.0.1
ete3: 3.1.3
exonerate: exonerate 2.4.0
fasta: 36.3.8e
glimmerhmm: 3.0.4
gmap: 2024-03-15
gmes_petap.pl: 4.71_lic
hisat2: 2.2.1
hmmscan: HMMER 3.1b2 (February 2015)
hmmsearch: HMMER 3.1b2 (February 2015)
java: 1.7.0_91
kallisto: 0.50.1
mafft: v7.525 (2024/Mar/13)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.28-r1209
pigz: 2.8
proteinortho: 6.3.1
pslCDnaFilter: no way to determine
salmon: salmon 0.13.0
samtools: samtools 1.19.2
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.12 (Nov 2022)
tantan: tantan 49
tbl2asn: 25.8
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: signalp not installed
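
The report above states that all six environment variables are set, which by itself does not show that each one points to a directory that exists on this machine. As a rough sketch of an extra sanity check, the snippet below looks up each variable named in the report and classifies it. The function name `check_env_dirs` and the status strings are hypothetical and are not part of funannotate.

```python
import os

# Variable names copied from the "Checking Environmental Variables" report.
REQUIRED_VARS = [
    "FUNANNOTATE_DB",
    "PASAHOME",
    "TRINITY_HOME",
    "EVM_HOME",
    "AUGUSTUS_CONFIG_PATH",
    "GENEMARK_PATH",
]


def check_env_dirs(names, environ=None):
    """Map each variable name to 'unset', 'not a directory', or 'ok'.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name in names:
        value = environ.get(name)
        if value is None:
            report[name] = "unset"
        elif os.path.isdir(value):
            report[name] = "ok"
        else:
            report[name] = "not a directory"
    return report


if __name__ == "__main__":
    for name, status in check_env_dirs(REQUIRED_VARS).items():
        print(f"{name}: {status}")
```

Run inside the activated conda environment, any entry other than "ok" would be worth checking before re-running the test.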