nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Process stalls indefinitely when running predict #863

Closed fereyj closed 1 year ago

fereyj commented 1 year ago

**Are you using the latest release?**
Using funannotate 1.8.13 with Python 3.8.15 and Ubuntu 22.04.1 LTS

**Describe the bug**
I cannot get predict to work. Whenever I use the predict command, it processes indefinitely (I have let it run 48 hrs without success). This occurs with the test and with any sample I try.

**What command did you issue?**
funannotate test -t all --cpus 8

**Logfiles**

(funannotate) igseq@igseq-Precision-5820-Tower:~$ funannotate test -t all --cpus 20
#########################################################
Running `funannotate clean` unit testing: minimap2 mediated assembly duplications
Downloading: https://osf.io/8pjbe/download?version=1 Bytes: 252076
CMD: funannotate clean -i test.clean.fa -o test.exhaustive.fa --exhaustive
#########################################################
minimap2 version=2.24-r1122 path=/home/igseq/miniconda3/envs/funannotate/bin/minimap2
-----------------------------------------------
6 input contigs, 6 larger than 500 bp, N50 is 427,039 bp
Checking duplication of 6 contigs
-----------------------------------------------
scaffold_73 appears duplicated: 100% identity over 100% of the contig. contig length: 15153
scaffold_91 appears duplicated: 100% identity over 100% of the contig. contig length: 8858
scaffold_27 appears duplicated: 100% identity over 100% of the contig. contig length: 427039
-----------------------------------------------
6 input contigs; 6 larger than 500 bp; 3 duplicated; 3 written to file
#########################################################
SUCCESS: `funannotate clean` test complete.
#########################################################

#########################################################
Running `funannotate mask` unit testing: RepeatModeler --> RepeatMasker
Downloading: https://osf.io/hbryz/download?version=1 Bytes: 375687
CMD: funannotate mask -i test.fa -o test.masked.fa --cpus 20
#########################################################
-------------------------------------------------------
[Feb 08 03:49 PM]: OS: Ubuntu 22.04, 8 cores, ~ 33 GB RAM. Python: 3.8.15
[Feb 08 03:49 PM]: Running funanotate v1.8.13
[Feb 08 03:49 PM]: Soft-masking simple repeats with tantan
[Feb 08 03:49 PM]: Repeat soft-masking finished: 
Masked genome: /home/igseq/test-mask_8ce967fe-be6a-43e3-b849-70cc60169d41/test.masked.fa
num scaffolds: 2
assembly size: 1,216,048 bp
masked repeats: 50,965 bp (4.19%)
-------------------------------------------------------
#########################################################
SUCCESS: `funannotate mask` test complete.
#########################################################

#########################################################
Running `funannotate predict` unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 20 --species Awesome testicus
#########################################################
-------------------------------------------------------
[Feb 08 03:49 PM]: OS: Ubuntu 22.04, 8 cores, ~ 33 GB RAM. Python: 3.8.15
[Feb 08 03:49 PM]: Running funannotate v1.8.13
[Feb 08 03:49 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
^CTraceback (most recent call last):
  File "/home/igseq/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/predict.py", line 253, in main
    if lib.which('bam2hints'):
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/library.py", line 852, in which
    subprocess.Popen([name], stdout=devnull,
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1020, in communicate
    self.wait()
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1083, in wait
    return self._wait(timeout=timeout)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1806, in _wait
    (pid, sts) = self._try_wait(0)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1764, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt
Traceback (most recent call last):
  File "/home/igseq/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 153, in runPredictTest
    runCMD(['funannotate', 'predict', '-i', inputFasta,
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 55, in runCMD
    subprocess.call(cmd, cwd=dir)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 342, in call
    return p.wait(timeout=timeout)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1083, in wait
    return self._wait(timeout=timeout)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1806, in _wait
    (pid, sts) = self._try_wait(0)
  File "/home/igseq/miniconda3/envs/funannotate/lib/python3.8/subprocess.py", line 1764, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt

**OS/Install Information**
(funannotate) igseq@igseq-Precision-5820-Tower:~$ funannotate check --show-versions
-------------------------------------------------------
Checking dependencies for 1.8.13
-------------------------------------------------------
You are running Python v 3.8.15. Now checking python packages...
biopython: 1.80
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.2.0
numpy: 1.24.1
pandas: 1.5.3
psutil: 5.9.4
requests: 2.28.2
scikit-learn: 1.2.1
scipy: 1.10.0
seaborn: 0.12.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.050
DBI: 1.643
DB_File: 1.855
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.12
YAML: 1.30
local::lib: 2.000029
threads: 2.25
threads::shared: 1.61
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/igseq/Documents/funannotate_db
$PASAHOME=/home/igseq/miniconda3/envs/funannotate/opt/pasa-2.5.2
$TRINITY_HOME=/home/igseq/miniconda3/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/home/igseq/miniconda3/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/home/igseq/miniconda3/envs/funannotate/config/
    ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
-------------------------------------------------------
Checking external dependencies...
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.5.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2021-08-25
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 17.0.3-internal
kallisto: 0.46.1
mafft: v7.515 (2023/Jan/15)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: pigz 2.6
proteinortho: 6.1.7
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.16.1
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.11 (Oct 2022)
tantan: tantan 40
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
    ERROR: emapper.py not installed
    ERROR: gmes_petap.pl not installed
    ERROR: signalp not installed

nextgenusfs commented 1 year ago

This is the same problem everybody else is having: it's augustus. The code in funannotate v1.8.13 and below is not compatible with augustus >= v3.4. The first thing to try is upgrading funannotate to master to see whether that fixes it. If it still doesn't work, please let me know here and provide the errors, then downgrade augustus to < v3.4. You can upgrade funannotate with pip from within that environment: `python -m pip install git+https://github.com/nextgenusfs/funannotate.git --upgrade --force --no-deps`.
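
To check quickly whether an environment falls on the wrong side of that version boundary, something like the following might help. It is only a sketch and not part of funannotate: the helper names are made up for illustration, and it assumes `augustus --version` prints a dotted version number somewhere in its output. The 3.4 cutoff is taken from the comment above; the `funannotate check --show-versions` output in the report lists augustus 3.5.0.

```python
import re
import shutil
import subprocess

# Assumption from this thread: funannotate <= 1.8.13 needs augustus < 3.4.
FIRST_INCOMPATIBLE = (3, 4)


def parse_version(text):
    """Return the first dotted version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def is_compatible(version):
    """True if the (major, minor) part of version is below the 3.4 cutoff."""
    return version[:2] < FIRST_INCOMPATIBLE


def installed_augustus_version():
    """Ask the augustus found on PATH for its version; None if it cannot be determined."""
    if shutil.which("augustus") is None:
        return None
    try:
        result = subprocess.run(
            ["augustus", "--version"],
            stdin=subprocess.DEVNULL,  # never let the child wait on our terminal
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Hedge: the banner may go to stdout or stderr, so search both.
    return parse_version(result.stdout + result.stderr)
```

Calling `installed_augustus_version()` inside the activated conda environment and passing a non-`None` result to `is_compatible()` tells you whether the downgrade step would apply to a v1.8.13 install.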

I've not tagged a new release as I've not had time to test with newer versions of augustus. And just an FYI -- there are many broken versions of augustus out there.

fereyj commented 1 year ago

This was very helpful, thank you. Upgrading funannotate as you described fixed the issue. The funannotate test -t all command worked. I am having a different issue now, but I will open a new ticket. Thank you again.