fbreitwieser / krakenuniq

🐙 KrakenUniq: Metagenomics classifier with unique k-mer counting for more specific results
GNU General Public License v3.0

does not contain necessary file database.kdb #71

Closed ShailNair closed 4 months ago

ShailNair commented 3 years ago

I downloaded and built a KrakenUniq database, apparently successfully, but when I run krakenuniq to classify my sequences it fails with the error "krakenuniq: database ("/home/mcs/database/krakenunique/db") does not contain necessary file database.kdb"

These are the files in my database folder: database_0, database_1, database_2, database_3, database_4, database_5, database-build.log, database.idx, database.jdb, library, library-files.txt, nt.fna, nt.fna.gz, taxonomy

There is no .kdb file; instead there is a .jdb file.

Here is the full terminal output:

(base) [mcs@mcs1 ~]$ conda activate krakenunique
(krakenunique) [mcs@mcs1 ~]$ krakenuniq-download --db /home/mcs/database/krakenunique/db --threads 75 --dust microbial-nt
Storing taxonomy timestamp [date > /home/mcs/database/krakenunique/db/taxonomy/timestamp] ... done (took 0s)
Extracting nodes file [tar -C /home/mcs/database/krakenunique/db/taxonomy -zxvf /home/mcs/database/krakenunique/db/taxonomy/taxdump.tar.gz nodes.dmp 1>&2] ...nodes.dmp done (took 2s)
/home/mcs/database/krakenunique/db/taxonomy/nodes.dmp check [148.92 MB]
Extracting names file [tar -C /home/mcs/database/krakenunique/db/taxonomy -zxvf /home/mcs/database/krakenunique/db/taxonomy/taxdump.tar.gz names.dmp 1>&2] ...names.dmp done (took 2s)
/home/mcs/database/krakenunique/db/taxonomy/names.dmp check [189.56 MB]
: downloading ... gunzipping ... done.
: downloading ... done.
: downloading ... done.
Reading headers from nt file ... Got 60411854 ACs (took 26m14s).
Reading taxonomy tree from /home/mcs/database/krakenunique/db/taxonomy/nodes.dmp ... Got 197410 nodes (took 3s).
Reading AC to taxonomy ID mapping from /home/mcs/database/krakenunique/db/taxonomy/nucl_gb.accession2taxid.gz ... Done (took 3m15s).
Reading AC to taxonomy ID mapping from /home/mcs/database/krakenunique/db/taxonomy/nucl_wgs.accession2taxid.gz ... Done (took 5m37s).
Got mappings for 772227 taxa.
Writing /home/mcs/database/krakenunique/db/library/nt-bacteria.fna ... Done, wrote 7329767 sequences for 513702 taxa (took 28m55s).
/home/mcs/database/krakenunique/db/library/nt-bacteria.fna check [95.75 GB]
Masking low-complexity sequences [dustmasker -infmt fasta -in /home/mcs/database/krakenunique/db/library/nt-bacteria.fna -level 20 -outfmt fasta | sed '/^>/! s/[^AGCT]/N/g' > /home/mcs/database/krakenunique/db/library/nt-bacteria-dustmasked.fna.tmp && mv /home/mcs/database/krakenunique/db/library/nt-bacteria-dustmasked.fna.tmp /home/mcs/database/krakenunique/db/library/nt-bacteria-dustmasked.fna] ... done (took 4h1m39s)
/home/mcs/database/krakenunique/db/library/nt-bacteria-dustmasked.fna check [95.97 GB]
Writing /home/mcs/database/krakenunique/db/library/nt-archaea.fna ... Done, wrote 364016 sequences for 13581 taxa (took 36s).
/home/mcs/database/krakenunique/db/library/nt-archaea.fna check [1.34 GB]
Masking low-complexity sequences [dustmasker -infmt fasta -in /home/mcs/database/krakenunique/db/library/nt-archaea.fna -level 20 -outfmt fasta | sed '/^>/! s/[^AGCT]/N/g' > /home/mcs/database/krakenunique/db/library/nt-archaea-dustmasked.fna.tmp && mv /home/mcs/database/krakenunique/db/library/nt-archaea-dustmasked.fna.tmp /home/mcs/database/krakenunique/db/library/nt-archaea-dustmasked.fna] ... done (took 3m27s)
/home/mcs/database/krakenunique/db/library/nt-archaea-dustmasked.fna check [1.34 GB]
Writing /home/mcs/database/krakenunique/db/library/nt-viral.fna ... Done, wrote 2488435 sequences for 210739 taxa (took 5m18s).
/home/mcs/database/krakenunique/db/library/nt-viral.fna check [5.90 GB]
Masking low-complexity sequences [dustmasker -infmt fasta -in /home/mcs/database/krakenunique/db/library/nt-viral.fna -level 20 -outfmt fasta | sed '/^>/! s/[^AGCT]/N/g' > /home/mcs/database/krakenunique/db/library/nt-viral-dustmasked.fna.tmp && mv /home/mcs/database/krakenunique/db/library/nt-viral-dustmasked.fna.tmp /home/mcs/database/krakenunique/db/library/nt-viral-dustmasked.fna] ... done (took 15m47s)
/home/mcs/database/krakenunique/db/library/nt-viral-dustmasked.fna check [5.85 GB]
Writing /home/mcs/database/krakenunique/db/library/nt-fungi.fna ... Done, wrote 5303491 sequences for 170750 taxa (took 13m57s).
/home/mcs/database/krakenunique/db/library/nt-fungi.fna check [13.19 GB]
Masking low-complexity sequences [dustmasker -infmt fasta -in /home/mcs/database/krakenunique/db/library/nt-fungi.fna -level 20 -outfmt fasta | sed '/^>/! s/[^AGCT]/N/g' > /home/mcs/database/krakenunique/db/library/nt-fungi-dustmasked.fna.tmp && mv /home/mcs/database/krakenunique/db/library/nt-fungi-dustmasked.fna.tmp /home/mcs/database/krakenunique/db/library/nt-fungi-dustmasked.fna] ... done (took 35m24s)
/home/mcs/database/krakenunique/db/library/nt-fungi-dustmasked.fna check [13.17 GB]
Writing /home/mcs/database/krakenunique/db/library/nt-protozoa.fna ... Done, wrote 1766972 sequences for 64610 taxa (took 4m20s).
/home/mcs/database/krakenunique/db/library/nt-protozoa.fna check [4.90 GB]
Masking low-complexity sequences [dustmasker -infmt fasta -in /home/mcs/database/krakenunique/db/library/nt-protozoa.fna -level 20 -outfmt fasta | sed '/^>/! s/[^AGCT]/N/g' > /home/mcs/database/krakenunique/db/library/nt-protozoa-dustmasked.fna.tmp && mv /home/mcs/database/krakenunique/db/library/nt-protozoa-dustmasked.fna.tmp /home/mcs/database/krakenunique/db/library/nt-protozoa-dustmasked.fna] ... done (took 18m38s)
/home/mcs/database/krakenunique/db/library/nt-protozoa-dustmasked.fna check [4.90 GB]
(krakenunique) [mcs@mcs1 ~]$ krakenuniq-build -db /home/mcs/database/krakenunique/db --jellyfish-hash-size 10000M --threads 75 --taxids-for-genomes --taxids-for-sequences --max-db-size 500
Found jellyfish v1.1.12
Kraken build set to minimize disk writes.
Finding all library files
Found 10 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using jellyfish
K-mer set created. [9h18m35.735s]
Skipping step 2, database reduction unnecessary.
Sorting k-mer set (step 3 of 6)...
db_sort: Getting database into memory ...Loaded database with 37286661928 keys with k of 31 [val_len 4, key_len 8].
Loaded database with 37286661928 keys with k of 31 [val_len 4, key_len 8].
db_sort: Sorting .../home/mcs/miniconda3/envs/krakenunique/libexec/build_db.sh: line 46: 8358 Killed db_sort -z -M -t 75 -n 15 -d database.jdb -o database0.kdb.tmp -i database.idx
(krakenunique) [mcs@mcs1 ~]$ krakenuniq --preload --db /home/mcs/database/krakenunique/db --threads 75 --paired --fastq-input /home/mcs/gene/shail/metagenomics/metagenomics/analysis/ORIGINAL/01_HOST_REMOVAL/original_1_host_removed_r1.fastq /home/mcs/gene/shail/metagenomics/metagenomics/analysis/ORIGINAL/01_HOST_REMOVAL/original_1_host_removed_r2.fastq --output /home/mcs/gene/shail/metagenomics/metagenomics/analysis/ORIGINAL/01_HOST_REMOVAL/taxonomy/krakenunique/original_1_krakenunq.reads} --report-file home/mcs/gene/shail/metagenomics/metagenomics/analysis/ORIGINAL/01_HOST_REMOVAL/taxonomy/krakenunique/original_1_krakenunq
krakenuniq: database ("/home/mcs/database/krakenunique/db") does not contain necessary file database.kdb
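
The file listing above contains database.jdb and database.idx but no database.kdb, and the build output stops during the k-mer sorting step (step 3 of 6), so the final database file was apparently never written. A small check like the one below can confirm that before starting a classification run. It is only a sketch: the helper name `missing_db_files` is hypothetical, and it checks only the file names passed to it (by default just database.kdb, the one named in the error message).

```shell
# Hypothetical helper: report expected files that are missing (or empty)
# in a database directory. Not part of KrakenUniq.
# Usage: missing_db_files DIR [FILE...]   (defaults to database.kdb)
missing_db_files() {
    dir=$1
    shift
    # Fall back to the single file named in the error message.
    [ "$#" -gt 0 ] || set -- database.kdb
    status=0
    for f in "$@"; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_db_files /home/mcs/database/krakenunique/db database.kdb database.idx` prints one line per absent file and returns a non-zero status if any is missing, which makes it easy to guard a classification job with `missing_db_files "$DB" && krakenuniq ...`.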

kmmehjabin commented 4 months ago

Did you figure out the solution? I have the same problem.

ShailNair commented 4 months ago

@kmmehjabin Cleaning the database (--clean) and rebuilding it worked for me.
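
The clean-and-rebuild sequence described above could look roughly like the sketch below. It assumes `krakenuniq-build` accepts `--clean` together with `--db`, as the comment implies; the exact effect of `--clean` is not described in this thread, so check `krakenuniq-build --help` for your version first. The wrapper names `run` and `clean_and_rebuild` are hypothetical. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
# Hypothetical wrappers, not part of KrakenUniq.
# run: echo the command (dry run) unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Usage: clean_and_rebuild DB_DIR [extra krakenuniq-build options...]
clean_and_rebuild() {
    db=$1
    shift
    # Assumption: --clean is accepted here, as suggested in the comment above.
    run krakenuniq-build --db "$db" --clean || return 1
    # Rebuild with whatever options the original build used.
    run krakenuniq-build --db "$db" "$@" || return 1
    # After a real run, confirm the file the classifier asks for now exists.
    if [ "${RUN:-0}" = 1 ]; then
        [ -s "$db/database.kdb" ]
    fi
}
```

A dry run with the options from the original build would be `clean_and_rebuild /home/mcs/database/krakenunique/db --jellyfish-hash-size 10000M --threads 75 --taxids-for-genomes --taxids-for-sequences --max-db-size 500`. Since the first build stopped during the sort step, it is worth watching the end of the rebuild output for the same interruption.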

christopherwilliamlee commented 1 month ago

Hello everyone,

I know it's been a long time since @ShailNair faced this issue, but others might run into the same problem. As an alternative to building the database yourself, you can download a prebuilt .kdb file from the KrakenUniq indexes listed at https://benlangmead.github.io/aws-indexes/k2.