nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

funannotate update Missing Dependencies: fasta. Please install missing dependencies and re-run script #307

Closed lorgas99 closed 4 years ago

lorgas99 commented 4 years ago

Hi Jon,

I am trying to run funannotate update in a conda environment on a non-model fish, but I get the following error message (also in the log file): "Missing Dependencies: fasta. Please install missing dependencies and re-run script".

I have installed PASA and fasta36, but funannotate check --show-versions still returns "ERROR: fasta not installed".

Previously I ran funannotate predict. I tried to softmask the genome first, but that returned an error, so I ran predict with --force like this:

funannotate predict -i Symphodus_melops.fasta -o funannotate_predict_test --species "Symphodus melops" --organism other --force --busco_db actinopterygii --transcript_evidence trinity.fasta --ploidy 2 --repeats2evm --stringtie stringtie_merged.gtf --cpus 40 --rna_bam 01_Melops1188.bam 02_Melops1195.bam

After completion it asked me to run "update":

[10:15 PM]: Collecting final annotation files for 62,314 total gene models
[10:15 PM]: Funannotate predict is finished, output files are in the /cluster/home/enriqubg/funannotate_predict_test/predict_results folder
[10:15 PM]: Your next step to capture UTRs and update annotation using PASA:

funannotate update -i /cluster/home/enriqubg/funannotate_predict_test --cpus 40 \
    --left illumina_forward_RNAseq_R1.fastq.gz \
    --right illumina_forward_RNAseq_R2.fastq.gz \
    --jaccard_clip

I would really appreciate any suggestions. Thanks.

These are my dependencies: funannotate check --show-versions

Checking dependencies for funannotate v1.5.2

You are running Python v 2.7.16. Now checking python packages...
biopython: 1.74
goatools: 0.9.5
matplotlib: 2.2.4
natsort: 6.0.0
numpy: 1.16.4
pandas: 0.24.2
psutil: 5.6.3
requests: 2.22.0
scikit-learn: 0.20.3
scipy: 1.2.1
seaborn: 0.9.0
All 11 python packages installed

You are running Perl v 5.026002. Now checking perl modules...
Bio::Perl: 1.007002
Carp: 1.38
Clone: 0.41
DBD::SQLite: 1.62
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.852
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.15
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking external dependencies...
RepeatMasker: RepeatMasker 4.0.8
RepeatModeler: RepeatModeler version DEV
Trinity: 2.8.5
augustus: 3.2.3
bamtools: bamtools 2.4.1
bedtools: bedtools v2.28.0
blat: BLAT v36
diamond: diamond 0.9.19
emapper.py: emapper-0.12.7
ete3: 3.1.1
exonerate: exonerate 2.4.0
gmap: 2017-11-15
hisat2: 2.1.0
hmmscan: HMMER 3.1b2 (February 2015)
hmmsearch: HMMER 3.1b2 (February 2015)
java: 11.0.1
kallisto: 0.46.0
mafft: v7.407 (2018/Jul/23)
makeblastdb: makeblastdb 2.6.0+
minimap2: 2.17-r941
nucmer: 3.1
pslCDnaFilter: no way to determine
rmblastn: rmblastn 2.6.0+
samtools: samtools 1.9
stringtie: 1.3.6
tRNAscan-SE: 2.0.3 (April 2019)
tbl2asn: unknown, likely 25.3
tblastn: tblastn 2.6.0+
trimal: trimAl v1.4.rev15 build[2013-12-17]
ERROR: CodingQuarry not installed
ERROR: fasta not installed
ERROR: gmes_petap.pl not installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/projects/bin/funannotate/db
$PASAHOME=/cluster/home/enriqubg/miniconda2/opt/pasa-2.3.3
$TRINITYHOME=/cluster/home/enriqubg/miniconda2/opt/trinity-2.8.5
$EVM_HOME=/cluster/home/enriqubg/miniconda2/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/cluster/home/enriqubg/miniconda2/config
$GENEMARK_PATH=/projects/bin/genemark/4.38/gm_et_linux_64/gmes_petap
$BAMTOOLS_PATH=/cluster/home/enriqubg/miniconda2/bin
All 7 environmental variables are set

hyphaltip commented 4 years ago

did you symlink fasta36 to fasta in the folder where it is installed? ln -s fasta36 fasta

lorgas99 commented 4 years ago

Yes, I did ln -s fasta36 fasta inside /cluster/home/enriqubg/miniconda2/pkgs/fasta-36.3.8e-1/bin

nextgenusfs commented 4 years ago

So you can type ‘fasta’ and get the fasta36 help menu? That is all the check script is doing.
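
As an illustration of that kind of check, here is a minimal Python 3 sketch that reports which program names cannot be resolved on $PATH. It is not funannotate's actual code: the function name find_missing and the list of programs are assumptions made for the example, and it only tests whether a name resolves, not whether the program prints a help menu.

```python
import shutil


def find_missing(programs, path=None):
    """Return the names in `programs` that do not resolve to an executable.

    `path` is an optional PATH-style string; None means the current $PATH.
    """
    return [name for name in programs if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # "fasta" is the name reported missing in this thread; the other two
    # names are taken from the dependency listing above.
    missing = find_missing(["fasta", "fasta36", "samtools"])
    print("Missing from PATH:", ", ".join(missing) if missing else "none")
```

If fasta36 resolves but fasta does not, the symlink either does not exist or sits in a directory that is not on $PATH.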

lorgas99 commented 4 years ago

No, actually typing fasta only returns “command not found”.

nextgenusfs commented 4 years ago

Okay then it isn’t properly symlinked.

lorgas99 commented 4 years ago

You were right, I did it to the file in the wrong location. Now it is working. Thanks a lot for your prompt and kind response!
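
For anyone hitting the same problem, one way to avoid linking in the wrong directory is to let $PATH decide where the link goes. The Python 3 sketch below is only an illustration under stated assumptions: the function name link_alias is hypothetical, it assumes a Linux filesystem that supports symlinks, and it assumes fasta36 itself is already found on $PATH.

```python
import os
import shutil


def link_alias(target="fasta36", alias="fasta", path=None):
    """Create `alias` as a symlink next to `target`, wherever `target` is on PATH.

    Returns the path of the symlink. Raises FileNotFoundError if `target`
    cannot be found, because a link placed anywhere else would not help.
    """
    target_path = shutil.which(target, path=path)
    if target_path is None:
        raise FileNotFoundError(f"{target} is not on PATH")
    alias_path = os.path.join(os.path.dirname(target_path), alias)
    if not os.path.lexists(alias_path):
        # Relative link, equivalent to running `ln -s fasta36 fasta`
        # inside the directory where fasta36 was found.
        os.symlink(os.path.basename(target_path), alias_path)
    return alias_path
```

Because the target is located through $PATH rather than through a package cache folder, the resulting link lands in a directory the shell actually searches.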

lorgas99 commented 4 years ago

Hi again,

I am trying to run funannotate update but I am having trouble getting it to complete. I would appreciate any hint about what may be going on. My command is:

funannotate update -i /cluster/home/enriqubg/funannotate_predict_test --cpus 40 --left Melops_R1.fastq.gz --right Melops_R2.fastq.gz --jaccard_clip --species Symphodus melops

I got the following error: "Alignment failed, BAM files empty" (log file copied at the bottom). Based on the log file, it seems that the normalized R1 file from Trinity was not completed and the error was passed on to the later steps. In case it is relevant, the successful funannotate predict run was done without running funannotate train first.
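
The log at the bottom shows hisat2 reporting that it could not open left.norm.fq, so a quick check of the intermediate files can confirm where the chain broke. The following Python 3 sketch is an assumption-laden illustration, not part of funannotate: the function name unusable_inputs is hypothetical, and the paths to pass in would be the normalized read files named in the log.

```python
import os


def unusable_inputs(paths):
    """Return the entries of `paths` that are missing, unreadable, or empty."""
    bad = []
    for p in paths:
        ok = os.path.isfile(p) and os.access(p, os.R_OK) and os.path.getsize(p) > 0
        if not ok:
            bad.append(p)
    return bad
```

Calling it on the two normalized read files before the alignment step would list any file that the aligner will not be able to read.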

I was concerned that the problem may be related to providing several .sort.bam files to funannotate predict, so I decided to run funannotate train and let funannotate create the BAM files before rerunning the predict step, like this:

funannotate train -i Symphodus_melops.fasta -o funannotate_train --left Melops_R1.fastq.gz --right Melops_R2.fastq.gz --trinity Trinity.fasta --memory 120G --pasa_db mysql --jaccard_clip --cpus 40 --species "Symphodus melops" --stranded RF

but I continued getting the same error:

seqclean running options:
seqclean trinity.fasta -c 2
Standard log file: seqcl_trinity.fasta.log
Error log file: err_seqcl_trinity.fasta.log
Using 2 CPUs for cleaning
-= Rebuilding trinity.fasta cdb index =-
Launching actual cleaning process:
psx -p 2 -n 1000 -i trinity.fasta -d cleaning -C '/usit/abel/u1/enriqubg/trinity.fasta:ANLMS100:::11:0' -c '/cluster/home/enriqubg/miniconda2/opt/pasa-2.3.3/bin/seqclean.psx'
Error at 'psx -p 2 -n 1000 -i trinity.fasta -d cleaning -C '/usit/abel/u1/enriqubg/trinity.fasta:ANLMS100:::11:0' -c '/cluster/home/enriqubg/miniconda2/opt/pasa-2.3.3/bin/seqclean.psx''

Process terminated with an error!
seqclean (trinity.fasta) encountered an error. Working directory was /usit/abel/u1/enriqubg
seqclean (trinity.fasta) encountered an error. Working directory was /usit/abel/u1/enriqubg
Alignment failed, BAM files empty. Please check logfile

I have not been able to figure out what the problem may be and would really appreciate any suggestion. Thanks,

[07/25/19 00:04:37]: OS: linux2, 32 cores, ~ 66 GB RAM. Python: 2.7.16
[07/25/19 00:04:38]: Running funannotate v1.5.2
[07/25/19 00:04:42]: No NCBI SBT file given, will use default, for NCBI submissions pass one here '--sbt'
[07/25/19 00:07:40]: Reannotating Symphodus melops, NCBI accession: None
[07/25/19 00:07:40]: Previous annotation consists of: 60,121 protein coding gene models and 2,193 non-coding gene models
[07/25/19 00:07:40]: Input reads: ('/cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R1.fastq.gz', '/cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R2.fastq.gz', None)
[07/25/19 00:07:40]: Adapter and Quality trimming PE reads with Trimmomatic
[07/25/19 00:07:40]: java -jar /usit/abel/u1/enriqubg/miniconda2/opt/trinity-2.8.5/trinity-plugins/Trimmomatic-0.36/trimmomatic.jar PE -threads 40 -phred33 /cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R1.fastq.gz /cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R2.fastq.gz /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.unpaired.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.unpaired.fastq ILLUMINACLIP:/usit/abel/u1/enriqubg/miniconda2/opt/trinity-2.8.5/trinity-plugins/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
[07/25/19 00:15:22]: TrimmomaticPE: Started with arguments: -threads 40 -phred33 /cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R1.fastq.gz /cluster/home/enriqubg/nodes/wrasse_RNA/Illumina_fastq_gz/Melops_R2.fastq.gz /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.unpaired.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.fastq /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.unpaired.fastq ILLUMINACLIP:/usit/abel/u1/enriqubg/miniconda2/opt/trinity-2.8.5/trinity-plugins/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

[07/25/19 00:15:22]: gzip -f /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.fastq
[07/25/19 01:32:23]: gzip -f /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.unpaired.fastq
[07/25/19 01:32:24]: gzip -f /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.fastq
[07/25/19 02:49:51]: gzip -f /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.unpaired.fastq
[07/25/19 02:49:52]: Quality trimmed reads: ('/cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.fastq.gz', '/cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.fastq.gz', None)
[07/25/19 02:49:52]: /usit/abel/u1/enriqubg/miniconda2/opt/trinity-2.8.5/util/insilico_read_normalization.pl --PARALLEL_STATS --JM 50G --max_cov 50 --seqType fq --output /cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize --CPU 40 --pairs_together --left /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_left.fastq.gz --right /cluster/home/enriqubg/funannotate_predict_test/update_misc/trimmomatic/trimmed_right.fastq.gz
[07/25/19 02:56:55]: Normalized reads: ('/cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize/left.norm.fq', '/cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize/right.norm.fq', None)
[07/25/19 02:56:55]: Long reads: (None, None, None)
[07/25/19 02:56:55]: Long reads FASTA format: (None, None, None)
[07/25/19 02:56:55]: Long SeqCleaned reads: (None, None, None)
[07/25/19 02:56:55]: Starting Trinity genome guided
[07/25/19 02:56:55]: Building Hisat2 genome index, incorporating exons and splice-sites
[07/25/19 02:56:55]: hisat2-build --exon /cluster/home/enriqubg/funannotate_predict_test/update_misc/genome.exons --ss /cluster/home/enriqubg/funannotate_predict_test/update_misc/genome.ss /cluster/home/enriqubg/funannotate_predict_test/update_misc/genome.fa /cluster/home/enriqubg/funannotate_predict_test/update_misc/hisat2.genome
[07/25/19 03:43:23]: Aligning reads to genome using Hisat2
[07/25/19 03:43:23]: /projects/cees/bin/funannotate/py2/funannotate-1.5.2/util/sam2bam.sh hisat2 -p 40 --max-intronlen 3000 --dta -x /cluster/home/enriqubg/funannotate_predict_test/update_misc/hisat2.genome -1 /cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize/left.norm.fq -2 /cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize/right.norm.fq 4 /cluster/home/enriqubg/funannotate_predict_test/update_misc/hisat2.coordSorted.bam
[07/25/19 03:43:25]: Warning: Could not open read file "/cluster/home/enriqubg/funannotate_predict_test/update_misc/normalize/left.norm.fq" for reading; skipping...
Error: No input read files were valid
(ERR): hisat2-align exited with value 1

[07/25/19 03:43:25]: Running genome-guided Trinity, logfile: /cluster/home/enriqubg/funannotate_predict_test/update_misc/Trinity-gg.log
[07/25/19 03:43:25]: Clustering of reads from BAM and preparing assembly commands
[07/25/19 03:43:25]: Trinity --no_distributed_trinity_exec --genome_guided_bam /cluster/home/enriqubg/funannotate_predict_test/update_misc/hisat2.coordSorted.bam --genome_guided_max_intron 3000 --CPU 40 --max_memory 50G --output /cluster/home/enriqubg/funannotate_predict_test/update_misc/trinity_gg --jaccard_clip
[07/25/19 03:43:33]: Assembling 0 Trinity clusters using 39 CPUs
[07/25/19 03:43:34]: /usit/abel/u1/enriqubg/miniconda2/opt/trinity-2.8.5/util/support_scripts/GG_partitioned_trinity_aggregator.pl Trinity_GG
[07/25/19 03:43:34]: StringTie installed, running StringTie on Hisat2 coordsorted BAM
[07/25/19 03:43:34]: stringtie -p 40 /cluster/home/enriqubg/funannotate_predict_test/update_misc/hisat2.coordSorted.bam
[07/25/19 03:43:34]: Alignment failed, BAM files empty. Please check logfile
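
In a log this long, the final message ("Alignment failed, BAM files empty") is only the last symptom, and the first entry that mentions a problem is usually more informative. The Python 3 sketch below shows one way to find it, assuming the "[MM/DD/YY HH:MM:SS]:" prefix seen in the log above. The function names and the keyword list are assumptions for illustration, and a keyword match can give false positives (for example, a path containing the word "error").

```python
import re

# Matches the "[07/25/19 03:43:25]:" prefix used in the log above.
STAMP = re.compile(r"\[\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\]:")
# Hypothetical keyword list; adjust to taste.
PROBLEM = re.compile(r"\b(warning|error|failed)\b", re.IGNORECASE)


def split_entries(log_text):
    """Split a log into (timestamp, message) pairs, one per stamped entry."""
    stamps = STAMP.findall(log_text)
    bodies = STAMP.split(log_text)[1:]  # drop any text before the first stamp
    return [(s, " ".join(b.split())) for s, b in zip(stamps, bodies)]


def first_problem(log_text):
    """Return the earliest entry mentioning a warning, error, or failure."""
    for stamp, message in split_entries(log_text):
        if PROBLEM.search(message):
            return stamp, message
    return None
```

Applied to the log in this comment, the earliest match would be the 03:43:25 entry where hisat2 could not open left.norm.fq, which comes before the empty-BAM message.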

nextgenusfs commented 4 years ago

On some clusters seqclean doesn’t work. Follow the links in this issue to get the updated psx code and recompile: https://github.com/nextgenusfs/funannotate/issues/288

lorgas99 commented 4 years ago

Thanks once again, I will do it and let you know how it works.