I have built a custom Kraken2 database that contains the entire GTDB as well as unique MAGs I generated (~67,000 genomes). The hash.k2d file is 305 GB in size. When I try to classify reads, kraken2 cannot even load the database:
kraken2 --db Custom_Database --threads 30 --gzip-compressed --output Output/Sample.output.txt --report Output/Sample.report.txt --report-zero-counts --paired Reads/Sample.R1.fastq Reads/Sample.R2.fastq
Loading database information..........Killed.

kraken2 --version
Kraken version 2.1.2

kraken2-inspect --skip-counts --db Custom_Database
Loading database information..........Killed.

grep MemTotal /proc/meminfo
MemTotal: 394816212 kB
This was the output from when I built the database:
Creating sequence ID to taxonomy ID map (step 1)...
Sequence ID to taxonomy ID map complete. [2.070s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 312273650832 bytes
Capacity estimation complete. [11m56.324s]
Building database files (step 3)...
Taxonomy parsed and converted.
CHT created with 18 bits reserved for taxid.
Completed processing of 7746034 sequences, 202613488062 bp
Writing data to disk... complete.
Database files completed. [12h3m44.879s]
Database construction complete. [Total: 12h15m47.164s]
I am presuming I do not have enough RAM to classify. It is unlikely I will be able to use a server with more RAM.
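For reference, here is a rough sanity check of the two figures above. It is only a sketch: it compares the estimated hash table size from the build log with MemTotal, and it ignores the other database files, memory already in use by other processes, and any limit imposed by a job scheduler. The variable names are my own.

```shell
#!/usr/bin/env bash
# Figures copied from the build log and the /proc/meminfo output above.
hash_bytes=312273650832   # "Estimated hash table requirement" in bytes
mem_total_kb=394816212    # MemTotal; /proc/meminfo "kB" units are 1024 bytes

mem_total_bytes=$(( mem_total_kb * 1024 ))
headroom_bytes=$(( mem_total_bytes - hash_bytes ))

# Whole GiB, rounded down, is enough for a rough comparison.
gib=$(( 1024 * 1024 * 1024 ))
hash_gib=$(( hash_bytes / gib ))
mem_gib=$(( mem_total_bytes / gib ))
headroom_gib=$(( headroom_bytes / gib ))

echo "hash table: ${hash_gib} GiB"
echo "MemTotal:   ${mem_gib} GiB"
echo "headroom:   ${headroom_gib} GiB"
```

On paper the hash table is smaller than MemTotal, but MemTotal is the installed memory, not what is actually free when kraken2 starts, so the real headroom may be much smaller.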
Is there any other way I can make this work?
Help is appreciated. Many thanks