nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Test dataset not working #957

Open gubrins opened 9 months ago

gubrins commented 9 months ago

Are you using the latest release?

Yes, funannotate v1.8.16.

Describe the bug

I am trying to run funannotate with the test dataset and it does not work.

What command did you issue?

funannotate test -t all --cpus 20

Logfiles

#########################################################
Running `funannotate clean` unit testing: minimap2 mediated assembly duplications
CMD: funannotate clean -i test.clean.fa -o test.exhaustive.fa --exhaustive
#########################################################
minimap2 version=2.26-r1175 path=/home/goliath/miniconda3/envs/funannotate/bin/minimap2
-----------------------------------------------
6 input contigs, 6 larger than 500 bp, N50 is 427,039 bp
Checking duplication of 6 contigs
-----------------------------------------------
scaffold_73 appears duplicated: 100% identity over 100% of the contig. contig length: 15153
scaffold_91 appears duplicated: 100% identity over 100% of the contig. contig length: 8858
scaffold_27 appears duplicated: 100% identity over 100% of the contig. contig length: 427039
-----------------------------------------------
6 input contigs; 6 larger than 500 bp; 3 duplicated; 3 written to file
#########################################################
SUCCESS: `funannotate clean` test complete.
#########################################################

#########################################################
Running `funannotate mask` unit testing: RepeatModeler --> RepeatMasker
CMD: funannotate mask -i test.fa -o test.masked.fa --cpus 20
#########################################################
-------------------------------------------------------
[Sep 06 03:08 PM]: OS: Ubuntu 22.04, 128 cores, ~ 528 GB RAM. Python: 3.8.15
[Sep 06 03:08 PM]: Running funanotate v1.8.16
[Sep 06 03:08 PM]: Soft-masking simple repeats with tantan
[Sep 06 03:08 PM]: Repeat soft-masking finished: 
Masked genome: /home/goliath/software/funannotate/test-mask_602a0cea-aac2-4f9f-9c26-d4f3ce919da2/test.masked.fa
num scaffolds: 2
assembly size: 1,216,048 bp
masked repeats: 50,965 bp (4.19%)
-------------------------------------------------------
#########################################################
SUCCESS: `funannotate mask` test complete.
#########################################################

#########################################################
Running `funannotate predict` unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 20 --species Awesome testicus
#########################################################
-------------------------------------------------------
[Sep 06 03:08 PM]: OS: Ubuntu 22.04, 128 cores, ~ 528 GB RAM. Python: 3.8.15
[Sep 06 03:08 PM]: Running funannotate v1.8.16
[Sep 06 03:08 PM]: Skipping CodingQuarry as no --rna_bam passed
[Sep 06 03:08 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     pretrained     
  genemark     selftraining   
  glimmerhmm   busco          
  snap         busco          
[Sep 06 03:08 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Sep 06 03:08 PM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
/home/goliath/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-p2g.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import parse_version
[Sep 06 03:08 PM]: Mapping 1,065 proteins to genome using diamond and exonerate
[Sep 06 03:08 PM]: Found 1,505 preliminary alignments with diamond in 0:00:01 --> generated FASTA files for exonerate in 0:00:00
     Progress: 1505 complete, 0 failed, 0 remaining          
[Sep 06 03:08 PM]: Exonerate finished in 0:00:09: found 1,270 alignments
[Sep 06 03:08 PM]: Running GeneMark-ES on assembly
[Sep 06 03:09 PM]: 1,562 predictions from GeneMark
[Sep 06 03:09 PM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Sep 06 03:12 PM]: 370 valid BUSCO predictions found, validating protein sequences
[Sep 06 03:13 PM]: 194 BUSCO predictions validated
[Sep 06 03:13 PM]: Running Augustus gene prediction using saccharomyces parameters
     Progress: 11 complete, 0 failed, 0 remaining        
[Sep 06 03:14 PM]: 1,485 predictions from Augustus
[Sep 06 03:14 PM]: Pulling out high quality Augustus predictions
[Sep 06 03:14 PM]: Found 371 high quality predictions from Augustus (>90% exon evidence)
[Sep 06 03:14 PM]: Running SNAP gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Sep 06 03:14 PM]: 0 predictions from SNAP
[Sep 06 03:14 PM]: SNAP prediction failed, moving on without result
[Sep 06 03:14 PM]: Running GlimmerHMM gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Sep 06 03:15 PM]: 737 predictions from GlimmerHMM
[Sep 06 03:15 PM]: Summary of gene models passed to EVM (weights):
  Source         Weight   Count
  Augustus       1        1325 
  Augustus HiQ   2        372  
  GeneMark       1        1562 
  GlimmerHMM     1        737  
  Total          -        3996 
[Sep 06 03:15 PM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
     Progress: 41 complete, 0 failed, 0 remaining        
[Sep 06 03:15 PM]: Converting to GFF3 and collecting all EVM results
[Sep 06 03:15 PM]: Evidence modeler has failed, exiting
#########################################################
Traceback (most recent call last):
  File "/home/goliath/miniconda3/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/home/goliath/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/goliath/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/home/goliath/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 160, in runPredictTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/home/goliath/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-predict_602a0cea-aac2-4f9f-9c26-d4f3ce919da2/annotate/predict_results/Awesome_testicus.gff3'
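
The traceback above is a follow-on failure: the log reports "Evidence modeler has failed, exiting", so the GFF3 file that the test then tries to count was never written. Below is a minimal sketch of a gene counter that reports this more directly. It assumes the standard GFF3 layout of tab-separated columns with the feature type in column 3. `count_gff_genes` is a hypothetical name used for illustration and is not funannotate's own `countGFFgenes`.

```python
import os


def count_gff_genes(path):
    """Count 'gene' features in a GFF3 file (hypothetical helper)."""
    # Fail early with a hint instead of a bare FileNotFoundError.
    if not os.path.isfile(path):
        raise RuntimeError(
            f"expected output {path!r} is missing; an upstream step "
            "(for example Evidence Modeler) probably failed"
        )
    count = 0
    with open(path, "r") as handle:
        for line in handle:
            # Skip directives and comments such as '##gff-version 3'.
            if line.startswith("#"):
                continue
            cols = line.rstrip("\n").split("\t")
            # Column 3 (index 2) holds the feature type in GFF3.
            if len(cols) >= 3 and cols[2] == "gene":
                count += 1
    return count
```

With a check like this, the missing file would point back at the Evidence Modeler step rather than at the test harness.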

OS/Install Information

-------------------------------------------------------
Checking dependencies for 1.8.16
-------------------------------------------------------
You are running Python v 3.8.15. Now checking python packages...
biopython: 1.76
goatools: 1.3.1
matplotlib: 3.4.3
natsort: 8.4.0
numpy: 1.24.4
pandas: 2.0.3
psutil: 5.7.0
requests: 2.31.0
scikit-learn: 1.3.0
scipy: 1.10.1
seaborn: 0.12.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.046
DBI: 1.643
DB_File: 1.858
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.17
YAML: 1.30
local::lib: 2.000029
threads: 2.25
threads::shared: 1.61
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/goliath/software/funannotate/funannotate_db
$PASAHOME=/home/goliath/miniconda3/envs/funannotate/opt/pasa-2.5.3
$TRINITY_HOME=/home/goliath/miniconda3/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/home/goliath/miniconda3/envs/funannotate/opt/evidencemodeler-2.1.0
$AUGUSTUS_CONFIG_PATH=/home/goliath/miniconda3/envs/funannotate/config/
$GENEMARK_PATH=/home/goliath/software/gmes/gmes_linux_64
All 6 environmental variables are set
-------------------------------------------------------
Checking external dependencies...
PASA: 2.5.3
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.4.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.31.0
blat: BLAT v37x1
diamond: 2.0.8
emapper.py: 2.1.3
ete3: 3.1.3
exonerate: exonerate 2.4.0
fasta: 36.3.8g
glimmerhmm: 3.0.4
gmap: 2023-07-20
gmes_petap.pl: 4.71_lic
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.20.1
kallisto: 0.46.1
mafft: v7.520 (2023/Mar/22)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.26-r1175
pigz: 2.6
proteinortho: 6.3.0
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.16.1
signalp: environment.
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.12 (Nov 2022)
tantan: tantan 40
tbl2asn: 25.8
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
All 37 external dependencies are installed

gubrins commented 9 months ago

Installing it in a new environment seems to have improved things. However, I am not sure about this error, which I still get when running funannotate test -t all --cpus 20:

[Sep 06 05:28 PM]: Predicting secreted proteins with SignalP
Traceback (most recent call last):
  File "/home/goliath/miniconda3/envs/funannotate_florida/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/goliath/miniconda3/envs/funannotate_florida/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/home/goliath/miniconda3/envs/funannotate_florida/lib/python3.8/site-packages/funannotate/annotate.py", line 1331, in main
    lib.signalP(
  File "/home/goliath/miniconda3/envs/funannotate_florida/lib/python3.8/site-packages/funannotate/library.py", line 7205, in signalP
    version = int(version.split(".")[0])
ValueError: invalid literal for int() with base 10: 'environment'
#########################################################
ERROR: `funannotate annotate` test failed - check logfiles
#########################################################

However, at the end I get this:

[Sep 06 05:31 PM]: Funannotate compare completed successfully!
#########################################################
SUCCESS: `funannotate compare` test complete.
#########################################################
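
For reference, the `ValueError` in the SignalP step comes from calling `int()` on the text before the first dot of the reported version string, which here is `environment` (the dependency check earlier in this thread also printed `signalp: environment.`). Why SignalP reports that string is not clear from the logs. The sketch below only shows a more defensive way to parse a major version number. `parse_major_version` is a hypothetical helper, not part of funannotate, and what the caller should do when it returns `None` is left open.

```python
import re


def parse_major_version(text):
    """Return the leading integer of a version string, or None.

    Hypothetical helper: accepts strings such as '4.1' or 'v7.520' and
    returns None for anything that does not start with a number.
    """
    match = re.match(r"\s*v?(\d+)", text)
    return int(match.group(1)) if match else None


# A normal version string yields its major number.
assert parse_major_version("4.1") == 4
# The string from the log yields None instead of raising ValueError.
assert parse_major_version("environment.") is None
```

Returning `None` lets the caller decide whether to skip the SignalP step or stop with a clearer message than the raw `ValueError`.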