Closed liuyca1 closed 2 years ago
What is the output of ls -l $FUNANNOTATE_DB? It seems that during funannotate setup something didn't properly create a database file needed for annotate.
$ ls -l $FUNANNOTATE_DB
total 57018900
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 14 2017 actinopterygii
-rw-rw-r-- 1 liuyuanchao liuyuanchao 220677450 Nov 6 11:01 actinopterygii.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Nov 18 2016 alveolata_stramenophiles
-rw-rw-r-- 1 liuyuanchao liuyuanchao 10551644 Nov 6 10:42 alveolata_stramenophiles.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Nov 1 2016 arthropoda
-rw-rw-r-- 1 liuyuanchao liuyuanchao 43933198 Nov 6 10:45 arthropoda.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 ascomycota
-rw-rw-r-- 1 liuyuanchao liuyuanchao 67966037 Nov 6 10:27 ascomycota.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 aves
-rw-rw-r-- 1 liuyuanchao liuyuanchao 137974970 Nov 6 11:08 aves.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 basidiomycota
-rw-rw-r-- 1 liuyuanchao liuyuanchao 68863784 Nov 6 10:41 basidiomycota.tar.gz
drwxrwxr-x 3 liuyuanchao liuyuanchao 4096 Jul 26 18:36 busco_db
-rw-r--r-- 1 liuyuanchao liuyuanchao 2374032 Nov 8 11:59 busco_outgroups.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 8098 Nov 10 16:46 dbCAN.changelog.txt
-rw-rw-r-- 1 liuyuanchao liuyuanchao 63489 Nov 10 16:46 dbCAN-fam-HMMs.txt
-rw-rw-r-- 1 liuyuanchao liuyuanchao 94317882 Nov 10 16:46 dbCAN.hmm
-rw-r--r-- 1 liuyuanchao liuyuanchao 17191104 Nov 10 16:46 dbCAN.hmm.h3f
-rw-r--r-- 1 liuyuanchao liuyuanchao 29869 Nov 10 16:46 dbCAN.hmm.h3i
-rw-r--r-- 1 liuyuanchao liuyuanchao 39279761 Nov 10 16:46 dbCAN.hmm.h3m
-rw-r--r-- 1 liuyuanchao liuyuanchao 46144080 Nov 10 16:46 dbCAN.hmm.h3p
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Jul 23 16:08 dikarya
-rw-r--r-- 1 liuyuanchao liuyuanchao 66199252 Jul 23 16:08 dikarya.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 diptera
-rw-rw-r-- 1 liuyuanchao liuyuanchao 145735505 Nov 6 10:55 diptera.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 41370988544 Mar 2 2021 eggnog.db
-rw-rw-r-- 1 liuyuanchao liuyuanchao 32817522 Aug 3 15:31 eggnog.db.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 9285439161 Mar 2 2021 eggnog_proteins.dmnd
-rw-r--r-- 1 liuyuanchao liuyuanchao 278003712 Nov 11 2020 eggnog.taxa.db
-rw-r--r-- 1 liuyuanchao liuyuanchao 6628719 Nov 11 2020 eggnog.taxa.db.traverse.pkl
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 14 2017 embryophyta
-rw-rw-r-- 1 liuyuanchao liuyuanchao 64919077 Nov 6 11:25 embryophyta.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 endopterygota
-rw-rw-r-- 1 liuyuanchao liuyuanchao 118029754 Nov 6 10:48 endopterygota.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 euarchontoglires
-rw-rw-r-- 1 liuyuanchao liuyuanchao 315719027 Nov 6 11:18 euarchontoglires.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Nov 2 2016 eukaryota
-rw-rw-r-- 1 liuyuanchao liuyuanchao 13244593 Nov 6 10:42 eukaryota.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 eurotiomycetes
-rw-rw-r-- 1 liuyuanchao liuyuanchao 210241744 Nov 6 10:34 eurotiomycetes.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 473 Nov 9 08:14 funannotate-annotate.136510.log
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1900 Aug 4 20:23 funannotate-annotate.36075.log
-rw-rw-r-- 1 liuyuanchao liuyuanchao 420 Nov 9 14:58 funannotate-annotate.65436.log
-rw-rw-r-- 1 liuyuanchao liuyuanchao 473 Nov 9 11:54 funannotate-annotate.66655.log
-rw-rw-r-- 1 liuyuanchao liuyuanchao 420 Nov 9 15:01 funannotate-annotate.68022.log
-rw-r--r-- 1 liuyuanchao liuyuanchao 1191 Nov 19 09:01 funannotate-db-info.txt
-rw-r--r-- 1 liuyuanchao liuyuanchao 11017103 Mar 2 2016 funannotate.repeat.proteins.fa
-rw-rw-r-- 1 liuyuanchao liuyuanchao 6325661 Nov 8 11:53 funannotate.repeat.proteins.fa.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 11017079 Nov 8 11:53 funannotate.repeats.reformat.fa
-rw-rw-r-- 1 liuyuanchao liuyuanchao 598 Nov 9 15:06 funannotate-setup.log
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 fungi
-rw-rw-r-- 1 liuyuanchao liuyuanchao 12673693 Nov 6 10:25 fungi.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 33814749 Nov 9 10:11 go.obo
drwxrwxr-x 3 liuyuanchao liuyuanchao 30 Aug 4 17:33 hmmer
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 hymenoptera
-rw-rw-r-- 1 liuyuanchao liuyuanchao 233690214 Nov 6 10:52 hymenoptera.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 insecta
-rw-rw-r-- 1 liuyuanchao liuyuanchao 67256544 Nov 6 10:46 insecta.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 2102658 Nov 10 17:47 interpro.tsv
-rw-rw-r-- 1 liuyuanchao liuyuanchao 193864157 Nov 10 17:47 interpro.xml
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 laurasiatheria
-rw-rw-r-- 1 liuyuanchao liuyuanchao 286089494 Nov 6 11:23 laurasiatheria.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 mammalia
-rw-rw-r-- 1 liuyuanchao liuyuanchao 262985539 Nov 6 11:13 mammalia.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1445045 Nov 19 09:01 merops.dmnd
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1383413 Nov 19 09:01 merops.formatted.fa
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1957603 Nov 19 09:01 merops_scan.lib
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 metazoa
-rw-rw-r-- 1 liuyuanchao liuyuanchao 39476850 Nov 6 10:43 metazoa.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 21740858 Nov 8 11:54 mibig.dmnd
-rw-rw-r-- 1 liuyuanchao liuyuanchao 21244378 Nov 8 11:54 mibig.fa
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 microsporidia
-rw-rw-r-- 1 liuyuanchao liuyuanchao 21181462 Nov 6 10:26 microsporidia.tar.gz
drwxrwxr-x 2 liuyuanchao liuyuanchao 4096 Aug 4 17:32 mmseqs
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1363545 Nov 6 17:38 ncbi_cleaned_gene_products.txt
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 nematoda
-rw-rw-r-- 1 liuyuanchao liuyuanchao 45483712 Nov 6 10:44 nematoda.tar.gz
drwxr-xr-x 2 liuyuanchao liuyuanchao 4096 Dec 5 2016 outgroups
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 pezizomycotina
-rw-rw-r-- 1 liuyuanchao liuyuanchao 164636032 Nov 6 10:30 pezizomycotina.tar.gz
drwxrwxr-x 2 liuyuanchao liuyuanchao 4096 Aug 3 16:30 pfam
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1114025 Nov 10 17:46 Pfam-A.clans.tsv
-rw-rw-r-- 1 liuyuanchao liuyuanchao 1538737879 Nov 10 17:46 Pfam-A.hmm
-rw-r--r-- 1 liuyuanchao liuyuanchao 351982037 Nov 10 17:46 Pfam-A.hmm.h3f
-rw-r--r-- 1 liuyuanchao liuyuanchao 1323456 Nov 10 17:46 Pfam-A.hmm.h3i
-rw-r--r-- 1 liuyuanchao liuyuanchao 636978544 Nov 10 17:46 Pfam-A.hmm.h3m
-rw-r--r-- 1 liuyuanchao liuyuanchao 749322833 Nov 10 17:46 Pfam-A.hmm.h3p
-rw-rw-r-- 1 liuyuanchao liuyuanchao 111 Nov 10 17:46 Pfam.version
-rw-r--r-- 1 liuyuanchao liuyuanchao 581185 Aug 7 2018 protein.evidence.fasta
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Nov 18 2016 protists
-rw-rw-r-- 1 liuyuanchao liuyuanchao 9459518 Nov 6 10:42 protists.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 11090231 Nov 8 11:53 repeats.dmnd
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 saccharomycetales
-rw-rw-r-- 1 liuyuanchao liuyuanchao 72402218 Nov 6 10:40 saccharomycetales.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 saccharomycetes
-rw-rw-r-- 1 liuyuanchao liuyuanchao 91857381 Nov 6 10:39 saccharomycetes.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 sordariomycetes
-rw-rw-r-- 1 liuyuanchao liuyuanchao 192011468 Nov 6 10:37 sordariomycetes.tar.gz
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 tetrapoda
-rw-rw-r-- 1 liuyuanchao liuyuanchao 209110833 Nov 6 11:05 tetrapoda.tar.gz
drwxr-xr-x 110 liuyuanchao liuyuanchao 4096 Jul 23 16:08 trained_species
-rw-rw-r-- 1 liuyuanchao liuyuanchao 280362561 Nov 9 09:21 uniprot_sprot.fasta
drwxr-xr-x 5 liuyuanchao liuyuanchao 4096 Feb 13 2017 vertebrata
-rw-rw-r-- 1 liuyuanchao liuyuanchao 137371990 Nov 6 10:57 vertebrata.tar.gz
-rw-rw-r-- 1 liuyuanchao liuyuanchao 5255536 Aug 3 20:55 wget-log
We also suspect that it is a database problem, but we don't know how to solve it.
It looks like the UniProt database, or rather its Diamond database, is missing. You should be able to fix it with:
funannotate setup -i uniprot --force --wget
This is the check in the code: https://github.com/nextgenusfs/funannotate/blob/master/funannotate/annotate.py#L453-L454
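A check of this kind can be reproduced by hand before rerunning annotate. The following is only a minimal sketch and not funannotate's actual code: the file list in EXPECTED_FILES is an assumption (uniprot.dmnd in particular is the presumed name of the Diamond database that is absent from the listing above), and missing_db_files is a hypothetical helper name.

```python
import os

# Assumed list of files the annotate step needs; the authoritative list is in
# funannotate's annotate.py and may differ between versions.
EXPECTED_FILES = ["merops.dmnd", "uniprot.dmnd", "dbCAN.hmm", "Pfam-A.hmm"]


def missing_db_files(db_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty in db_dir."""
    missing = []
    for name in expected:
        path = os.path.join(db_dir, name)
        # Treat a zero-byte file like a missing one (e.g. an interrupted build).
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    db_dir = os.environ.get("FUNANNOTATE_DB", ".")
    for name in missing_db_files(db_dir):
        print(f"missing or empty: {os.path.join(db_dir, name)}")
```

Run against the directory listed above, a check like this would be expected to report only uniprot.dmnd, which matches the diagnosis that the Diamond database for UniProt was never built.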
[Nov 19 11:51 AM]: OS: CentOS Linux 7, 160 cores, ~ 958 GB RAM. Python: 3.9.7
[Nov 19 11:51 AM]: Running 1.8.7
[Nov 19 11:51 AM]: Database location: /data/liuyuanchao/funannotate_test/all_database
[Nov 19 11:51 AM]: Retrieving download links from GitHub Repo
[Nov 19 11:56 AM]: Parsing Augustus pre-trained species and porting to funannotate
[Nov 19 11:56 AM]: Downloading UniProtKB/SwissProt database
--2021-11-19 11:56:14-- ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
=> ‘/data/liuyuanchao/funannotate_test/all_database/uniprot_sprot.fasta.gz’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.197.74
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.197.74|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/databases/uniprot/current_release/knowledgebase/complete ... done.
==> SIZE uniprot_sprot.fasta.gz ... 90527596
==> PASV ... done. ==> RETR uniprot_sprot.fasta.gz ... done.
Length: 90527596 (86M) (unauthoritative)
uniprot_sprot.fasta.gz 100%[====================================================================================================================>] 86.33M 749KB/s in 6m 22s
2021-11-19 12:02:39 (231 KB/s) - ‘/data/liuyuanchao/funannotate_test/all_database/uniprot_sprot.fasta.gz’ saved [90527596]
--2021-11-19 12:02:41-- ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/reldate.txt
=> ‘/data/liuyuanchao/funannotate_test/all_database/uniprot.release-date.txt’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.197.74
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.197.74|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/databases/uniprot/current_release/knowledgebase/complete ... done.
==> SIZE reldate.txt ... 151
==> PASV ... done. ==> RETR reldate.txt ... done.
Length: 151 (unauthoritative)
reldate.txt 100%[====================================================================================================================>] 151 --.-KB/s in 0.003s
2021-11-19 12:02:44 (53.3 KB/s) - ‘/data/liuyuanchao/funannotate_test/all_database/uniprot.release-date.txt’ saved [151]
[Nov 19 12:02 PM]: Building diamond database
[Nov 19 12:02 PM]: UniProtKB Database: version=2021_04 date=2021-11-17 records=565,928
[Nov 19 12:04 PM]: OS: CentOS Linux 7, 160 cores, ~ 958 GB RAM. Python: 3.9.7
[Nov 19 12:04 PM]: Running 1.8.7
[Nov 19 12:04 PM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Nov 19 12:04 PM]: Parsing input files
[Nov 19 12:04 PM]: Existing tbl found: ./fun/predict_results/Laccaria_bicolor.tbl
[Nov 19 12:05 PM]: Adding Functional Annotation to Laccaria bicolor, NCBI accession: None
[Nov 19 12:05 PM]: Annotation consists of: 14,640 gene models
[Nov 19 12:05 PM]: 14,295 protein records loaded
[Nov 19 12:05 PM]: Running HMMer search of PFAM version 34.0
[Nov 19 12:06 PM]: 10,314 annotations added
[Nov 19 12:06 PM]: Running Diamond blastp search of UniProt DB version 2021_04
[Nov 19 12:07 PM]: 453 valid gene/product annotations from 624 total
[Nov 19 12:07 PM]: Running Eggnog-mapper
continue.....
We finished gene prediction with "funannotate predict" and "funannotate iprscan", but when we ran "funannotate annotate", an error message appeared:
$ funannotate annotate -i ./fun -d /data/liuyuanchao/funannotate_test/all_database/ --cpus 48
[Nov 19 11:29 AM]: OS: CentOS Linux 7, 160 cores, ~ 958 GB RAM. Python: 3.9.7
[Nov 19 11:29 AM]: Running 1.8.7
[Nov 19 11:29 AM]: Database files not found in /data/liuyuanchao/funannotate_test/all_database/, run funannotate database and/or funannotate setup
$ funannotate database
Funannotate Databases currently installed:
Database         Type       Version     Date        Num_Records  Md5checksum
merops           diamond    12.0        2017-10-04  5009         a6dd76907896708f3ca5335f58560356
uniprot          diamond    2021_03     2021-06-02  565254       68ed1e475d13bb3d5574c53822d11cd3
dbCAN            hmmer3     9.0         2020-08-04  641          04696dfba1c3bb82ff9b72cfbb3e4a65
pfam             hmmer3     34.0        2021-03     19179        f83c0d00445257fd9c066ad3e9e10568
repeats          diamond    1.0         2021-11-08  11950        4e8cafc3eea47ec7ba505bb1e3465d21
go               text       2021-10-26  2021-10-26  47226        6757c819642e79e1406cad3ffcb6ea3d
mibig            diamond    1.4         2021-11-08  31023        118f2c11edde36c81bdea030a0228492
interpro         xml        86.0        2021-06-03  38913        0d8c575f88f397397b9491520b38db1e
busco_outgroups  outgroups  1.0         2021-11-08  8            6795b1d4545850a4226829c7ae8ef058
gene2product     text       1.72        2021-10-18  34111        d844fe60a5ab66e07f884da1cc08f16c
To update a database type: funannotate setup -i DBNAME -d /data/liuyuanchao/funannotate_test/all_database --force
To see install BUSCO outgroups type: funannotate database --show-outgroups
To see BUSCO tree type: funannotate database --show-buscos
$ funannotate check --show-versions
Checking dependencies for 1.8.7
You are running Python v 3.9.7. Now checking python packages...
biopython: 1.79
goatools: 1.1.6
matplotlib: 3.4.3
natsort: 8.0.0
numpy: 1.21.4
pandas: 1.3.4
psutil: 5.8.0
requests: 2.26.0
scikit-learn: 1.0.1
scipy: 1.7.0
seaborn: 0.11.2
All 11 python packages installed
You are running Perl v b'5.026002'. Now checking perl modules...
Bio::Perl: 1.007002
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed
Checking Environmental Variables...
$FUNANNOTATE_DB=/data/liuyuanchao/funannotate_test/all_database
$PASAHOME=/opt/anaconda3/envs/funannotate/opt/pasa-2.4.1
$TRINITY_HOME=/opt/anaconda3/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/opt/anaconda3/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/opt/anaconda3/envs/funannotate/config/
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
Checking external dependencies...
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v36
diamond: 2.0.8
emapper.py: 2.1.3
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2018-07-04
gmes_petap.pl: 4.68_lic
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.8-internal
kallisto: 0.46.1
mafft: v7.490 (2021/Oct/30)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.22-r1101
proteinortho: 6.0.31
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.10
signalp: 5.0b
snap: 2006-07-28
stringtie: 2.1.7
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 26
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
All 36 external dependencies are installed
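The environment-variable part of the check output above (including the GENEMARK_PATH error) can be approximated with a few lines of Python. This is a rough sketch only: the variable names are copied from the output, but treating all of them as required is an assumption, and unset_variables is a hypothetical helper, not part of funannotate.

```python
import os

# Names taken from the check output above; whether each one is strictly
# required depends on which funannotate steps are run (an assumption here).
REQUIRED_VARS = [
    "FUNANNOTATE_DB",
    "PASAHOME",
    "TRINITY_HOME",
    "EVM_HOME",
    "AUGUSTUS_CONFIG_PATH",
    "GENEMARK_PATH",
]


def unset_variables(names=REQUIRED_VARS, environ=None):
    """Return the variable names that are unset or empty in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    for name in unset_variables():
        print(f"ERROR: {name} not set. export {name}=/path/to/dir")
```

Passing a plain dictionary as environ makes the helper easy to try without changing the real shell environment.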