DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

failed to build db #690

Closed nashanghenzan closed 1 year ago

nashanghenzan commented 1 year ago

When I run kraken2-build --build --threads 15 --db ./, I end up with only two files in my directory: seqid2taxid.map and taxo.k2d.tmp. I don't know why this happens; the details follow.

Creating sequence ID to taxonomy ID map (step 1)...
Sequence ID to taxonomy ID map complete. [0.281s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 68276765256 bytes
Capacity estimation complete. [39m3.126s]
Building database files (step 3)...
Taxonomy parsed and converted.
xargs: cat: terminated by signal 13
/root/miniconda3/libexec/build_kraken2_db.sh: line 143: 35914 Done                    list_sequence_files
     35915 Exit 125                | xargs -0 cat
     35916 Killed                  | build_db -k $KRAKEN2_KMER_LEN -l $KRAKEN2_MINIMIZER_LEN -S $KRAKEN2_SEED_TEMPLATE $KRAKEN2XFLAG -H hash.k2d.tmp -t taxo.k2d.tmp -o opts.k2d.tmp -n taxonomy/ -m $seqid2taxid_map_file -c $required_capacity -p $KRAKEN2_THREAD_CT $max_db_flag -B $KRAKEN2_BLOCK_SIZE -b $KRAKEN2_SUBBLOCK_SIZE -r $KRAKEN2_MIN_TAXID_BITS $fast_build_flag

ryan-preble commented 1 year ago

I have also had trouble building a database, though for different reasons. On an Ubuntu machine with 64 cores and 500 GB RAM, I ran Kraken2/kraken2-build --standard --db krakendb --threads 40. The output from nohup.out is as follows:

Downloading nucleotide gb accession to taxon map... done.
Downloading nucleotide wgs accession to taxon map... done.
Downloaded accession to taxon map(s)
Downloading taxonomy tree data... done.
Uncompressing taxonomy data... done.
Untarring taxonomy tree data... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Downloading plasmid files from FTP...awk: fatal: cannot open file `.listing' for reading (No such file or directory)

Everything appears to have worked until the final line. Is .listing a temporary file being deleted prematurely?

nashanghenzan commented 1 year ago

Those last lines of the output are just the final lines of code in the build_kraken2_db.sh script. I actually didn't see those files in my directory.

ryan-preble commented 1 year ago

Is .listing a file that used to be obtained through FTP? Seeing as NCBI FTP has undergone changes recently, does this need to be changed?

nashanghenzan commented 1 year ago

Thanks for your reply. Today I found that I can build the database when I remove bacteria (141G). Maybe it failed before because I didn't have enough memory.
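
The memory explanation is consistent with the first log in this thread: step 2 estimated a hash table of 68276765256 bytes, and build_db was then reported as Killed. Here is a minimal Python sketch of that comparison. It assumes the hash table has to fit in RAM during the build and that "64G" means 64 GiB. The function name fits_in_ram and the headroom figure are hypothetical and not part of Kraken 2.

```python
GIB = 1024 ** 3  # bytes per GiB


def fits_in_ram(required_bytes: int, ram_bytes: int, headroom_bytes: int = 0) -> bool:
    """Return True if the hash table plus reserved headroom fits in ram_bytes."""
    return required_bytes + headroom_bytes <= ram_bytes


# "Estimated hash table requirement" copied from the build log.
estimated = 68276765256
# Assumption: the machine's "64G" of RAM is 64 GiB.
ram = 64 * GIB

print(f"hash table: {estimated / GIB:.2f} GiB of {ram // GIB} GiB RAM")
print("fits with no headroom:", fits_in_ram(estimated, ram))
# Hypothetical 4 GiB reserved for the OS and other processes.
print("fits with 4 GiB headroom:", fits_in_ram(estimated, ram, 4 * GIB))
```

Under these assumptions the estimate takes up nearly all of the machine's memory, so leaving any realistic headroom pushes the build past what is available.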

nashanghenzan commented 1 year ago

Because I only have 64G of RAM, I finally built the db successfully by using --max-db-size 51.
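
For anyone repeating this: as far as I can tell, the Kraken 2 manual describes --max-db-size as a size in bytes, so treat the unit as an assumption here, since the comment above only shows the value 51. A small Python sketch of the conversion follows. It assumes the 51 was meant as 51 GiB. The helper name gib_to_bytes is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def gib_to_bytes(gib: int) -> int:
    """Convert a whole number of GiB to bytes."""
    return gib * GIB


# Assumption: the "51" in the comment denotes 51 GiB.
cap = gib_to_bytes(51)
# "Estimated hash table requirement" from the first build log.
estimated = 68276765256

print(f"kraken2-build --build --threads 15 --db ./ --max-db-size {cap}")
print("cap is below the estimated requirement:", cap < estimated)
```

The cap is smaller than the estimated requirement, which is the point of the option: the build stays under the limit instead of being killed.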