Open timghaly opened 2 months ago
Sorry, I forgot to include the commands that I used to download the taxonomy and reference library before building the kraken2 nr database. They were the following:
kraken2-build --download-taxonomy --db nr --threads 1
and
kraken2-build --download-library nr --protein --db nr --threads 1
@timghaly have you classified your samples with a less complex database? Maybe use the RefSeq indexes and see whether the percentage of unclassified reads is the same. Additionally, to diagnose whether the database build had any errors, it would be worth running the command "kraken2-inspect --db /path_to_your_db" to see what is actually in there.
Hahaha, I just had the exact same problem as you. It bothered me for a week before I found out why, and the reason turned out to be extremely stupid!
In the index folder you created, you will see that your hash.k2d file is only 137GB, but unmapped.txt is a whopping 12GB.
So the problem is obvious.
You need to add the --protein option when downloading the taxonomy (example: kraken2-build --download-taxonomy --db nr --threads 1 --protein).
After adding --protein, the software downloads prot.accession2taxid.gz instead of nucl_gb.accession2taxid and nucl_wgs.accession2taxid.
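A quick way to check an existing database folder for the two symptoms described above could look something like the sketch below. It is only a rough check, not part of kraken2: the function name `check_protein_db` is made up for illustration, and it assumes the accession map sits under `taxonomy/` and `unmapped.txt` sits at the top level of the database folder. A few unmapped sequences may be normal, so the count is only reported, not treated as a failure.

```shell
# Hypothetical helper (not a kraken2 command): look for the symptoms of a
# protein database whose taxonomy was downloaded without --protein.
# Assumes the layout <db>/taxonomy/ and <db>/unmapped.txt.
check_protein_db() {
    db="$1"
    status=0

    # A protein database needs the protein accession map; the glob also
    # matches a still-compressed prot.accession2taxid.gz.
    if ! ls "$db"/taxonomy/prot.accession2taxid* >/dev/null 2>&1; then
        echo "missing: $db/taxonomy/prot.accession2taxid"
        status=1
    fi

    # Report how many sequences could not be mapped to a taxonomy ID.
    # This is informational only and does not change the return status.
    if [ -s "$db/unmapped.txt" ]; then
        echo "unmapped sequences: $(wc -l < "$db/unmapped.txt")"
    fi

    return "$status"
}
```

For example, `check_protein_db ~/databases/kraken2/nr` would print a "missing" line and return a non-zero status if only the nucleotide accession maps were downloaded.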
Thanks @callAgene , that is exactly what has happened. Thanks for pointing that out for me!
Hey kraken2 team, thanks for this tool.
I am trying to classify reads using the nr database, but am finding ~99% of reads are unclassified, and only eukaryote hits (~1%) are being classified.
I built the nr database using the following command:
unset OMP_NUM_THREADS
kraken2-build --build --protein --db nr --threads 48
I am then attempting to classify with the following:
kraken2 --threads 48 --use-names --report "kraken2/G2677.k2_nr_report.txt" --db ~/databases/kraken2/nr --gzip-compressed --paired "fastp/G2677.trimmed.R1.fq.gz" "fastp/G2677.trimmed.R2.fq.gz"
I have also run kraken2 on an ONT metagenome with the same command (except without the --paired argument), and I get the same results.
I am presuming something went wrong during the kraken2-build stage. Any help would be greatly appreciated.