nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

funannotate predict funannotate.library error when running augustus_parallel.py #424

Closed jasminelmah closed 4 years ago

jasminelmah commented 4 years ago

Hi Jon, I've run into some troubles with funannotate predict. Thanks for any help!

Are you using the latest release? v.1.7.4

Describe the bug `funannotate predict` fails at `augustus_parallel.py`.

[05:31 AM]: Running Augustus gene prediction using trichoplax_adhaerens parameters
Traceback (most recent call last):
  File "/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/lib/python2.7/site-packages/funannotate/aux_scripts/augustus_parallel.py", line 9, in <module>
    import funannotate.library as lib
ImportError: No module named funannotate.library
[05:31 AM]: Augustus prediction failed, check `logfiles/augustus-parallel.log`

What command did you issue?

funannotate predict -i /home/jlm329/scratch60/funannotate_scratch60/predict/trichoplax.scaffolds.fa.masked \
    --species "Trichoplax adhaerens" \
    --out fun_predict.sort \
    --rna_bam $RNA/Tad_KammSenatore.merged.sort.bam \
    --optimize_augustus \
    --transcript_evidence /home/jlm329/scratch60/funannotate_scratch60/train/fun_train.sort/training/funannotate_train.trinity-GG.fasta \
    --pasa_gff /home/jlm329/scratch60/funannotate_scratch60/train/fun_train.sort/training/funannotate_train.pasa.gff3 \
    --busco_db metazoa --max_intronlen 15000 \
    --organism other -d /home/jlm329/project/trix/funannotate_db --cpus 8 \
    --repeats2evm 

Logfiles

[02:56 AM]: OS: linux2, 20 cores, ~ 131 GB RAM. Python: 2.7.15
[02:56 AM]: Running funannotate v1.7.4
[02:56 AM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[02:56 AM]: Parsed training data, run ab-initio gene predictors as follows:
[02:56 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[02:56 AM]: Genome loaded: 714 scaffolds; 92,651,896 bp; 6.99% repeats masked
[02:56 AM]: Aligning transcript evidence to genome with minimap2
[02:57 AM]: Found 61,537 alignments, wrote GFF3 and Augustus hints to file
[02:57 AM]: Extracting hints from RNA-seq BAM file using bam2hints
[03:25 AM]: Mapping 549,682 proteins to genome using diamond and exonerate
[03:31 AM]: Found 212,113 preliminary alignments --> aligning with exonerate
[04:57 AM]: Exonerate finished: found 1,834 alignments
[04:57 AM]: Filtering PASA data for suitable training set
[04:58 AM]: 2,077 of 7,310 models pass training parameters
[04:58 AM]: Training Augustus using PASA gene models
[04:59 AM]: Augustus initial training results:
[05:31 AM]: Augustus optimized training results:
[05:31 AM]: Running Augustus gene prediction using trichoplax_adhaerens parameters
Traceback (most recent call last):
  File "/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/lib/python2.7/site-packages/funannotate/aux_scripts/augustus_parallel.py", line 9, in <module>
    import funannotate.library as lib
ImportError: No module named funannotate.library
[05:31 AM]: Augustus prediction failed, check `logfiles/augustus-parallel.log`

OS/Install Information

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/jlm329/project/trix/funannotate_db
$PASAHOME=/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/opt/pasa-2.4.1
$TRINITY_HOME=/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/config/
    ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
-------------------------------------------------------
Checking external dependencies...
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.29.2
blat: BLAT v36
diamond: 0.9.21
emapper.py: 2.0.1
ete3: 3.1.1
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.0
hmmscan: HMMER 3.3 (Nov 2019)
hmmsearch: HMMER 3.3 (Nov 2019)
java: 11.0.1-internal
kallisto: 0.46.2
mafft: v7.464 (2020/Apr/21)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.17-r941
proteinortho: 6.0.16
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.9
snap: 2006-07-28
stringtie: 2.1.2
tRNAscan-SE: 2.0.5 (October 2019)
tantan: tantan 13
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: Error occurred during initialization of VM
    ERROR: gmes_petap.pl not installed
    ERROR: signalp not installed

nextgenusfs commented 4 years ago

How did you install? That would suggest that the python in your $PATH is not the same one that is running funannotate, i.e., the funannotate Python package is not installed properly.
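One way to test that hypothesis is to ask the interpreter on $PATH where it lives and whether it can import the module named in the traceback. The sketch below is illustrative only: `can_import` and `report_python` are hypothetical helper names, not part of funannotate, and it assumes the interpreter is reached through the plain command `python`.

```bash
# Hypothetical helpers (not part of funannotate) for checking which
# interpreter a shell resolves and whether it can import a module.

# can_import INTERPRETER MODULE
# Succeeds (exit status 0) if INTERPRETER can import MODULE.
can_import() {
    "$1" -c "import $2" >/dev/null 2>&1
}

# report_python
# Prints the first `python` on PATH, the path Python reports for itself,
# and whether funannotate.library (the module from the traceback) imports.
report_python() {
    py="$(command -v python || true)"
    if [ -z "$py" ]; then
        echo "no python found on PATH"
        return 1
    fi
    echo "python on PATH:  $py"
    echo "sys.executable:  $("$py" -c 'import sys; print(sys.executable)')"
    if can_import "$py" funannotate.library; then
        echo "funannotate.library: importable"
    else
        echo "funannotate.library: NOT importable"
    fi
}
```

Calling `report_python` once in the interactive shell and once at the top of the batch script should make any difference between the two environments visible.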

jasminelmah commented 4 years ago

Thanks for the reply!

I used conda: `conda create -c conda-forge -c bioconda -n funannotate funannotate`. I'll check the python version in $PATH.

jasminelmah commented 4 years ago

Python 2.7.15 is in the bin directory of my funannotate conda environment, and it is the interpreter that runs when I use python interactively in that environment. In an interactive session it imports biopython just fine and loads the funannotate library with no error. The job, however, still fails at the same line in augustus_parallel.py.

nextgenusfs commented 4 years ago

Strange... even stranger is that the parent script imports that exact same library. Can you call the script directly from that environment, i.e. `/gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/lib/python2.7/site-packages/funannotate/aux_scripts/augustus_parallel.py -h`? I'm not sure what is going on, unfortunately; I don't see how that script could fail to work. The other thing to try would be to make sure it's executable and look at the ownership, like:

ls -l /gpfs/ysm/project/dunn/jlm329/conda_envs/funannotate/lib/python2.7/site-packages/funannotate/aux_scripts/

jasminelmah commented 4 years ago

I can directly call augustus_parallel.py -h from my funannotate environment. ls -l also shows that it is executable.

It looks like two other people have run into this problem with augustus_parallel.py in issue 405.

jasminelmah commented 4 years ago

The predict run worked! What seems to have happened is that I may have submitted the script from a messy environment. I started a new ssh session, loaded only miniconda and not the funannotate environment and submitted the job.
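A batch job normally starts from a copy of the environment it was submitted from (if the scheduler is Slurm, `sbatch` passes the submitting shell's environment to the job by default), so leftovers from earlier module loads or environment activations can travel with the job. Below is a small sketch for taking a snapshot of the variables that most often decide which Python and which packages are used. `show_python_env` is a hypothetical helper name, and the variable list is an assumption about what is relevant here, not something taken from funannotate.

```bash
# show_python_env: print exported variables that commonly influence which
# Python interpreter and packages a process picks up. Unset (or unexported)
# variables are shown with an empty value.
show_python_env() {
    for name in PATH PYTHONPATH PYTHONHOME CONDA_PREFIX CONDA_DEFAULT_ENV LOADEDMODULES; do
        # printenv fails for unset variables; "|| true" keeps the loop going
        printf '%s=%s\n' "$name" "$(printenv "$name" || true)"
    done
}
```

Redirecting the output to one file in a fresh login session and to another from inside the job, then comparing the two with `diff`, is one way to see what a "messy" submission environment changed.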

hyphaltip commented 2 years ago

If you are running in a Slurm environment, you may want to use `#!/usr/bin/bash -l` as the shebang of your job script to ensure a clean load of modules each time.

aldendirks commented 1 year ago

I was getting this same error running from a conda environment on a computing cluster with slurm. I was loading python again in my script (module load python), which was not needed (and detrimental) since the correct python was already installed in the conda environment. A rookie mistake; hopefully this comment will help others avoid it.
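A guard against this kind of shadowing is to check, inside the job script and after all `module load` and `conda activate` lines, that the `python` on PATH still belongs to the active conda environment. This is a minimal sketch, assuming conda has set `CONDA_PREFIX` on activation (without a trailing slash); `python_in_env` and `preflight_python` are hypothetical names introduced for illustration.

```bash
# python_in_env INTERPRETER_PATH ENV_PREFIX
# Succeeds if INTERPRETER_PATH lies underneath ENV_PREFIX.
python_in_env() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# preflight_python
# Fails with a message on stderr if no conda environment is active or if
# the first `python` on PATH is outside the active environment.
preflight_python() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no conda environment is active" >&2
        return 1
    fi
    found="$(command -v python || true)"
    if ! python_in_env "$found" "$CONDA_PREFIX"; then
        echo "python on PATH (${found:-none}) is outside $CONDA_PREFIX" >&2
        return 1
    fi
}
```

Placing `preflight_python || exit 1` just before the `funannotate predict` line would stop the job early instead of failing later, partway through the run, with the ImportError.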