nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

test -t predict failed at augustus #993

Closed ZeweiSong closed 5 months ago

ZeweiSong commented 5 months ago

I understand that there are already many issues about augustus; I've gone through them but did not find one similar to my case. So here it is:

I've installed funannotate with augustus=3.3.3, and here is the output for

funannotate test -t predict --cpus 4

It seems to be something with augustus, but I cannot figure it out on my own.

Thanks for any suggestions!

Zewei

#########################################################
Running `funannotate predict` unit testing
Downloading: https://osf.io/te2pf/download?version=1 Bytes: 1489808
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 4 --species Awesome testicus
#########################################################
-------------------------------------------------------
[Jan 05 11:08 AM]: OS: Ubuntu 22.04, 16 cores, ~ 8 GB RAM. Python: 3.8.15
[Jan 05 11:08 AM]: Running funannotate v1.8.16
[Jan 05 11:08 AM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
Traceback (most recent call last):
  File "/home/zewei/mambaforge/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/predict.py", line 708, in main
    for f in os.listdir(os.path.join(LOCALAUGUSTUS, "species", aug_species)):
FileNotFoundError: [Errno 2] No such file or directory: 'annotate/predict_misc/ab_initio_parameters/augustus/species/saccharomyces'
#########################################################
Traceback (most recent call last):
  File "/home/zewei/mambaforge/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 160, in runPredictTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/home/zewei/mambaforge/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-predict_cbf092b8-8916-4b5c-97b0-2edec6b43ae9/annotate/predict_results/Awesome_testicus.gff3'
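
The first traceback above fails while listing a local copy of the `saccharomyces` parameters under `annotate/predict_misc/ab_initio_parameters/augustus/species/`. One quick check is whether the installed Augustus has those parameters at all. The following is a minimal sketch, assuming the usual Augustus layout of `$AUGUSTUS_CONFIG_PATH/species/<name>`; `find_species_dir` is a hypothetical helper for illustration, not part of funannotate.

```python
import os
from pathlib import Path


def find_species_dir(species, config_path=None):
    """Return the Augustus parameter directory for `species`, or None if absent.

    Assumption: parameters live in <config_path>/species/<species>, where
    config_path defaults to the AUGUSTUS_CONFIG_PATH environment variable.
    """
    config_path = config_path or os.environ.get("AUGUSTUS_CONFIG_PATH")
    if not config_path:
        return None  # variable not set, nothing to look in
    candidate = Path(config_path) / "species" / species
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    found = find_species_dir("saccharomyces")
    print(found if found else "saccharomyces parameters not found")
```

If this reports that the directory is missing, the problem would be in the Augustus installation rather than in the copy step of the test.
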
ZeweiSong commented 5 months ago

Somehow it seems the scripts failed to copy the trained augustus model to the testing folder. I can run the command if I remove the --augustus_species option, like this:

funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --cpus 4 --species "Awesome testicus"

-------------------------------------------------------
[Jan 05 12:46 PM]: OS: Ubuntu 22.04, 16 cores, ~ 8 GB RAM. Python: 3.8.15
[Jan 05 12:46 PM]: Running funannotate v1.8.16
[Jan 05 12:46 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Jan 05 12:46 PM]: Skipping CodingQuarry as no --rna_bam passed
[Jan 05 12:46 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     busco
  glimmerhmm   busco
  snap         busco
[Jan 05 12:46 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Jan 05 12:46 PM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
[Jan 05 12:46 PM]: Existing protein alignments found: annotate/predict_misc/protein_alignments.gff3
[Jan 05 12:46 PM]: Existing BUSCO results found: annotate/predict_misc/busco.final.gff3 containing 197 predictions
[Jan 05 12:46 PM]: Not enough gene models 197 to train Augustus (200 required), exiting
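
The last log line reports 197 gene models against a required 200. To verify that number independently, the gene features in a GFF3 file can be counted directly. This is a rough sketch, assuming gene models appear as rows whose third column is `gene`; `count_gff_genes` and `enough_to_train` are hypothetical names and this is not how funannotate itself counts. The 200 threshold is taken from the log message above.

```python
MIN_MODELS = 200  # threshold quoted in the funannotate log message


def count_gff_genes(path):
    """Count rows in a GFF3 file whose feature type (column 3) is 'gene'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip directives and comments
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 3 and cols[2] == "gene":
                count += 1
    return count


def enough_to_train(n_models, minimum=MIN_MODELS):
    """Return True when the number of models meets the minimum."""
    return n_models >= minimum
```

For example, `enough_to_train(count_gff_genes("annotate/predict_misc/busco.final.gff3"))` would show whether the existing BUSCO results clear the threshold.
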
ZeweiSong commented 5 months ago

funannotate check

-------------------------------------------------------
Checking dependencies for 1.8.16
-------------------------------------------------------
To print all dependencies and versions: funannotate check --show-versions

You are running Python v 3.8.15. Now checking python packages...
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
All 27 Perl modules installed

Checking Environmental Variables...
        ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
-------------------------------------------------------
Checking external dependencies...
Couldn't find MMseqs2: /bin/sh: 1: /home/zewei/mambaforge/envs/eggnog-mapper-2.1.9/lib/python3.12/site-packages/eggnogmapper/bin/mmseqs: not found
        ERROR: gmes_petap.pl not installed
        ERROR: signalp not installed
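
The MMseqs2 line above points into a different conda environment (`eggnog-mapper-2.1.9`), which suggests the shell is resolving tools from more than one environment. A small sketch for seeing which executables are visible on `PATH` from the active environment, using only the standard library; the tool names are taken from the output above, and `missing_tools` is a hypothetical helper, not what `funannotate check` runs internally.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    tools = ["augustus", "mmseqs", "gmes_petap.pl", "signalp"]
    for name in tools:
        # shutil.which returns the resolved path, or None when not found
        print(name, "->", shutil.which(name))
    print("missing:", missing_tools(tools))
```

The printed paths show which environment each tool is being picked up from.
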