DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Could not build the fungi database #726

Open vmevada102 opened 1 year ago

vmevada102 commented 1 year ago

I am trying to use the prebuilt fungal database for a fungal diversity study.

I ran the following command:

kraken2-build --download-library fungi --db fungi

After the data processing completed, I still could not find the taxo.k2d file in the database folder.

The program terminated with the following error:

Kraken2: database ("./fungi") does not contain necessary file taxo.k2d
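
The error above names one missing file. As a minimal sketch, the helper below lists which expected files are absent from a database directory. It assumes a built Kraken 2 database consists of hash.k2d, opts.k2d, and taxo.k2d; the function name missing_k2_files is hypothetical and is not part of Kraken 2.

```shell
# Hypothetical helper (not part of Kraken 2): print each expected database
# file that is missing or empty in the given directory.
# Assumption: a built database contains hash.k2d, opts.k2d and taxo.k2d.
# Returns 0 if all three are present and non-empty, 1 otherwise.
missing_k2_files() (
    db=$1
    status=0
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -s "$db/$f" ]; then
            printf '%s\n' "$f"
            status=1
        fi
    done
    return "$status"
)
```

Running `missing_k2_files fungi` after a download-only run would be expected to list all three files, since none of them exist until the build step has finished.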

After repeated attempts, and after downloading the taxonomy files, I got a different error.

Command run: kraken2-build --build --db fungi --threads 46

Error: ..../build_kraken2_db.sh: line 105: KRAKEN2_LOAD_FACTOR: unbound variable
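
For context on the wording of that message: "unbound variable" is what bash prints when a script running with `set -u` (nounset) expands a variable that has not been set. The sketch below reproduces the message in isolation. DEMO_UNSET_VAR is a hypothetical name chosen for the demonstration, and nothing here explains why KRAKEN2_LOAD_FACTOR was unset in this installation.

```shell
# DEMO_UNSET_VAR is a hypothetical variable, deliberately left unset.
unset DEMO_UNSET_VAR

# Run a child bash with nounset enabled and capture whatever it prints,
# including the error message on stderr.
if msg=$(bash -c 'set -u; echo "$DEMO_UNSET_VAR"' 2>&1); then
    echo "variable was set"
else
    echo "bash refused to expand it: $msg"
fi
```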

nicolo-tellini commented 1 year ago

Hello @vmevada102,

Did you build the DB after downloading the library?

kraken2-build --build --db $DBNAME

See the manual, step 3: build the database.

best,

nic

cement-head commented 2 months ago

Try this:

$ kraken2-build --download-library fungi --db fungi
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
Processed 129 projects (2260 sequences, 3.54 Gbp)... done.
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.

$ kraken2-build --download-taxonomy --db fungi
Downloading nucleotide gb accession to taxon map... done.
Downloading nucleotide wgs accession to taxon map... done.
Downloaded accession to taxon map(s)
Downloading taxonomy tree data... done.
Uncompressing taxonomy data... done.
Untarring taxonomy tree data... done.

$ kraken2-build --build --db fungi --threads 12
Creating sequence ID to taxonomy ID map (step 1)...
Sequence ID to taxonomy ID map complete. [0.448s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 5207715840 bytes
Capacity estimation complete. [42.962s]
Building database files (step 3)...
Taxonomy parsed and converted.
CHT created with 9 bits reserved for taxid.
Completed processing of 2260 sequences, 3544237107 bp
Writing data to disk...  complete.
Database files completed. [10m28.498s]
Database construction complete. [Total: 11m11.929s]
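
The three commands in this session can be chained so that each step runs only if the previous one succeeded. This is a minimal sketch: the wrapper name build_fungi_db and the K2BUILD override are hypothetical conveniences, and only the kraken2-build options already shown in this thread are used.

```shell
# Hypothetical wrapper around the three steps shown in this thread.
# K2BUILD defaults to kraken2-build; set it to "echo" for a dry run that
# only prints the arguments each step would receive.
# Usage: build_fungi_db <db-directory> [threads]
build_fungi_db() (
    db=$1
    threads=${2:-1}
    k2build=${K2BUILD:-kraken2-build}
    # "&&" stops the chain at the first step that fails.
    "$k2build" --download-library fungi --db "$db" &&
    "$k2build" --download-taxonomy --db "$db" &&
    "$k2build" --build --db "$db" --threads "$threads"
)
```

For example, `build_fungi_db fungi 12` runs the same sequence as the session above, and `K2BUILD=echo build_fungi_db fungi 12` prints the three argument lists without downloading anything.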