nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Diamond CMD ERROR when using funannotate annotate #949

Open lichen-fungus opened 10 months ago

lichen-fungus commented 10 months ago

Are you using the latest release?
funannotate -version
funannotate v1.8.15

Describe the bug
I have used funannotate successfully many times for other fungal species, but for one of them I get an error at the end of the annotation process. funannotate annotate does not finish and throws a CMD error when diamond is run. The diamond input file smcluster.proteins.fasta is empty for some reason. I have already tried rerunning from scratch once after deleting all intermediate files, but again it would not finish.
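To confirm whether the diamond query file really contains no sequences (rather than, say, only a leading blank line), one can count its FASTA header lines. This is a minimal Python sketch; `count_fasta_records` is a hypothetical helper written for illustration and is not part of funannotate.

```python
from pathlib import Path


def count_fasta_records(path):
    """Return the number of FASTA header lines ('>') in a file.

    Hypothetical diagnostic helper, not funannotate code.
    A missing file is treated the same as an empty one (0 records).
    """
    fasta = Path(path)
    if not fasta.is_file():
        return 0
    with fasta.open() as handle:
        # Each record in a FASTA file starts with a '>' header line.
        return sum(1 for line in handle if line.startswith(">"))
```

Calling it on `annotate_misc/antismash/smcluster.proteins.fasta` inside the output directory would return 0 if the file is empty or holds only blank lines, which matches the situation described here.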

What command did you issue?
funannotate annotate -i /data/scratch/software/funannotate/analysis/my_species_pred/ \
  --cpus 10 \
  -d /data/scratch/software/funannotate/analysis/funannotate_db/ \
  --no-progress \
  --busco_db /data/scratch/software/funannotate/analysis/funannotate_db/ascomycota \
  --phobius /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/My_species_phobius.results.txt \
  --antismash /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/My_species_antismash.gbk \
  --iprscan /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/My_species_iprscan.xml \
  -s "My species" \
  --signalp /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/My_species_signalp6_prediction_results.txt \
  --eggnog /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/My_species_eggnog.emapper.annotations \
  --isolate "XXXYYY" 2>&1 | tee -a /data/scratch/software/funannotate/analysis/logfiles/my_species_logfile

Logfiles
screen log:
[Aug 16 01:11 AM]: OS: Ubuntu 18.04, 80 cores, ~ 791 GB RAM. Python: 3.8.13
[Aug 16 01:11 AM]: Running 1.8.15
[Aug 16 01:11 AM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Aug 16 01:11 AM]: Found existing output directory /data/scratch/software/funannotate/analysis/put_exs_pred. Warning, will re-use any intermediate files found.
[Aug 16 01:11 AM]: Parsing input files
[Aug 16 01:11 AM]: Existing tbl found: /data/scratch/software/funannotate/analysis/my_species_pred/predict_results/My_species.tbl
[Aug 16 01:11 AM]: Adding Functional Annotation to My species, NCBI accession: None
[Aug 16 01:11 AM]: Annotation consists of: 8,463 gene models
[Aug 16 01:11 AM]: 8,419 protein records loaded
[Aug 16 01:11 AM]: Existing Pfam-A results found: /data/scratch/software/funannotate/analysis/put_exs_pred/annotate_misc/annotations.pfam.txt
[Aug 16 01:11 AM]: 9,360 annotations added
[Aug 16 01:11 AM]: Running Diamond blastp search of UniProt DB version 2023_02
[Aug 16 01:11 AM]: 562 valid gene/product annotations from 825 total
[Aug 16 01:11 AM]: Existing Eggnog-mapper results found: /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/eggnog.emapper.annotations
[Aug 16 01:11 AM]: Parsing EggNog Annotations
[Aug 16 01:11 AM]: EggNog version parsed as 2.1.10
[Aug 16 01:11 AM]: 5,050 COG and EggNog annotations added
[Aug 16 01:11 AM]: Combining UniProt/EggNog gene and product names using Gene2Product version 1.88
[Aug 16 01:11 AM]: 1,347 gene name and product description annotations added
[Aug 16 01:11 AM]: Existing MEROPS results found: /data/scratch/software/funannotate/analysis/put_exs_pred/annotate_misc/annotations.merops.txt
[Aug 16 01:11 AM]: 250 annotations added
[Aug 16 01:11 AM]: Existing CAZYme results found: /data/scratch/software/funannotate/analysis/put_exs_pred/annotate_misc/annotations.dbCAN.txt
[Aug 16 01:11 AM]: 207 annotations added
[Aug 16 01:11 AM]: Existing BUSCO2 results found: /data/scratch/software/funannotate/analysis/put_exs_pred/annotate_misc/annotations.busco.txt
[Aug 16 01:11 AM]: 1,190 annotations added
[Aug 16 01:11 AM]: Existing Phobius results found: /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/phobius.results.txt
[Aug 16 01:11 AM]: Existing SignalP results found: /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/signalp.results.txt
[Aug 16 01:11 AM]: 470 secretome and 1,622 transmembane annotations added
[Aug 16 01:11 AM]: Parsing InterProScan5 XML file
[Aug 16 01:11 AM]: Now parsing antiSMASH v6 results, finding SM clusters
[Aug 16 01:11 AM]: Found 21 clusters, 75 biosynthetic enyzmes, and 68 smCOGs predicted by antiSMASH
[Aug 16 01:12 AM]: Found 109 duplicated annotations, adding 50,682 valid annotations
[Aug 16 01:12 AM]: Converting to final Genbank format, good luck!
[Aug 16 01:13 AM]: Creating AGP file and corresponding contigs file
[Aug 16 01:13 AM]: Cross referencing SM cluster hits with MIBiG database version 1.4
[Aug 16 01:13 AM]: CMD ERROR: diamond blastp --sensitive --query /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/antismash/smcluster.proteins.fasta --threads 10 --out /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/antismash/smcluster.MIBiG.blast.txt --db /data/scratch/software/funannotate/analysis/funannotate_db/mibig.dmnd --max-hsps 1 --evalue 0.001 --max-target-seqs 1 --outfmt 6
[Aug 16 01:13 AM]: diamond v2.0.15.153 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

CPU threads: 10

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /data/scratch/software/funannotate/analysis/my_species_pred/annotate_misc/antismash

Target sequences to report alignments for: 1

Opening the database... [0.033s]
Database: /data/scratch/software/funannotate/analysis/funannotate_db/mibig.dmnd (type: Diamond database, sequences: 31023, letters: 18898150)
Block size = 2000000000
Opening the input file... [0s]
Error: Error detecting input file format. First line seems to be blank.
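The final error shows diamond rejecting a query file whose first line is blank. One possible way to avoid the crash when reproducing this step by hand is to check the query before launching diamond and skip the search if it holds no FASTA header. The sketch below is only an illustration: the function names are hypothetical, it is not how funannotate itself handles this step, and the diamond flags are copied from the failing command in the log.

```python
import subprocess
from pathlib import Path


def first_line_is_fasta_header(path):
    """True only if the file exists and its first line starts with '>'."""
    query = Path(path)
    if not query.is_file():
        return False
    with query.open() as handle:
        # An empty file yields "" here; a blank first line yields "\n".
        return handle.readline().startswith(">")


def build_diamond_cmd(query, db, out, threads=10):
    """Assemble the diamond blastp call with the flags seen in the log."""
    return [
        "diamond", "blastp", "--sensitive",
        "--query", str(query),
        "--threads", str(threads),
        "--out", str(out),
        "--db", str(db),
        "--max-hsps", "1",
        "--evalue", "0.001",
        "--max-target-seqs", "1",
        "--outfmt", "6",
    ]


def run_mibig_search(query, db, out, threads=10):
    """Run diamond only when the query looks like FASTA.

    Returns False (search skipped) for a missing, empty, or
    blank-first-line query; otherwise runs diamond and returns True.
    """
    if not first_line_is_fasta_header(query):
        return False
    subprocess.run(build_diamond_cmd(query, db, out, threads), check=True)
    return True
```

With the empty smcluster.proteins.fasta from this report, `run_mibig_search` would return False without calling diamond. Whether skipping the MIBiG cross-reference is acceptable for the final annotation is a separate question that this sketch does not answer.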


OS/Install Information

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.050
DBI: 1.643
DB_File: 1.855
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.13
Tie::File: 1.06
URI::Escape: 5.12
YAML: 1.30
local::lib: 2.000029
threads: 2.25
threads::shared: 1.61
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/data/scratch/software/funannotate/analysis/funannotate_db/
$PASAHOME=/home/admin/anaconda3/envs/funannotate_1.8/opt/pasa-2.5.2
$TRINITY_HOME=/home/admin/anaconda3/envs/funannotate_1.8/opt/trinity-2.8.5
$EVM_HOME=/home/admin/anaconda3/envs/funannotate_1.8/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/home/admin/anaconda3/envs/funannotate_1.8/config/
$GENEMARK_PATH=/home/admin/bin/gmes_linux_64_4
All 6 environmental variables are set

Checking external dependencies...
ERROR: gmap found but error running gmap
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.5.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: 36.3.8g
glimmerhmm: 3.0.4
gmes_petap.pl: 4.35
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 17.0.3-internal
kallisto: 0.46.1
mafft: v7.508 (2022/Sep/07)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: 2.4
proteinortho: 6.1.7
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.16.1
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.11 (Oct 2022)
tantan: tantan 40
tbl2asn: 25.8
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: gmap not installed
ERROR: signalp not installed

taavirit commented 2 months ago

I have an identical issue with version 1.8.17. Have you found a solution?