rx32940 / Lepto-Metagenomics


KRAKEN2 with complete database #3

Open rx32940 opened 4 years ago

rx32940 commented 4 years ago

Kraken2 standard library: default k-mer length 35, minimizer length 31. Built from the NCBI taxonomy with the bacterial, archaeal, and viral domains, human, and a collection of known vectors (UniVec_Core).

Kraken2 custom library: bacterial, archaeal, and viral domains plus a rat reference.
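A custom library like the one described above could be assembled with `kraken2-build`. The sketch below only builds the command lines and is not the exact procedure used here; the database directory, the rat FASTA file name, and the thread count are hypothetical placeholders. The k-mer and minimizer lengths are the defaults quoted above.

```python
import shlex
from typing import List


def build_custom_db_commands(db_dir: str, rat_fasta: str, threads: int = 8) -> List[List[str]]:
    """Return kraken2-build command lines for a bacteria/archaea/viral + rat database."""
    # Taxonomy must be present before libraries are added.
    commands = [["kraken2-build", "--download-taxonomy", "--db", db_dir]]
    for library in ("bacteria", "archaea", "viral"):
        commands.append(["kraken2-build", "--download-library", library, "--db", db_dir])
    # The host reference is added as a custom FASTA (hypothetical file name).
    # Its sequence headers need an NCBI accession or a "kraken:taxid|<id>" tag.
    commands.append(["kraken2-build", "--add-to-library", rat_fasta, "--db", db_dir])
    commands.append([
        "kraken2-build", "--build", "--db", db_dir,
        "--kmer-len", "35", "--minimizer-len", "31",
        "--threads", str(threads),
    ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_custom_db_commands("custom_db", "rat_reference.fna"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without Kraken2 installed.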

rx32940 commented 4 years ago

Ran kraken2/bracken at the phylum and genus levels.

dir for kraken2 output with standard db:

/scratch/rx32940/kraken/output/kraken_out

dir for kraken2 output with standard db after bracken estimation:

/scratch/rx32940/kraken/output/bracken_out

dir for kraken2 output with custom db:

/scratch/rx32940/kraken/output/custom_out

dir for kraken2 output with custom db after bracken estimation:

/scratch/rx32940/kraken/output/custom_bracken

code used to run kraken2 and bracken at the phylum and genus levels
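For readers without access to the linked script, the two steps might look roughly like the sketch below. It is an illustration, not the script referenced above: it assumes paired-end reads and a read length of 150, and the database path and FASTQ file names are hypothetical. The output directories are the ones listed earlier in this thread.

```python
import shlex
from typing import List

# Bracken's -l option takes a one-letter rank code.
BRACKEN_LEVELS = {"phylum": "P", "genus": "G"}


def kraken2_command(db: str, r1: str, r2: str, report: str, output: str,
                    threads: int = 8) -> List[str]:
    """Classify one paired-end sample and write a per-sample report."""
    return ["kraken2", "--db", db, "--threads", str(threads), "--paired",
            "--report", report, "--output", output, r1, r2]


def bracken_command(db: str, report: str, output: str, level: str,
                    read_len: int = 150) -> List[str]:
    """Re-estimate abundance from a kraken2 report at the given rank."""
    return ["bracken", "-d", db, "-i", report, "-o", output,
            "-r", str(read_len), "-l", BRACKEN_LEVELS[level]]


if __name__ == "__main__":
    db = "/path/to/standard_db"  # hypothetical database location
    sample = "R22.K"
    kraken_dir = "/scratch/rx32940/kraken/output/kraken_out"
    bracken_dir = "/scratch/rx32940/kraken/output/bracken_out"
    report = f"{kraken_dir}/{sample}.report"
    print(shlex.join(kraken2_command(
        db, f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz",  # hypothetical inputs
        report, f"{kraken_dir}/{sample}.kraken")))
    for level in BRACKEN_LEVELS:
        print(shlex.join(bracken_command(
            db, report, f"{bracken_dir}/{level}/{sample}.bracken", level)))
```

Bracken also needs a k-mer distribution file for the chosen read length in the database directory, which `bracken-build` generates once per database.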

rx32940 commented 4 years ago

output in Dropbox:

$HOME/Dropbox/5. Rachel's projects/Metagenomic_Analysis/Kraken2-standard/standard/phylum(genus)/results

output on sapelo2:

/scratch/rx32940/kraken/output/bracken_out/genus(phylum)

Sample ID  Classified (%)  Unclassified (%)
R22.K      46.85           53.15
R22.L      24.93           75.07
R22.S      46.44           53.56
R26.K      52.83           47.17
R26.L      33.54           66.46
R26.S      47.44           52.56
R27.K      56.64           43.36
R27.L      35.5            74.5
R27.S      42.27           57.73
R28.K      25.73           74.27
R28.L      28.1            71.9
R28.S      26.79           73.21

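Since the classified and unclassified percentages of a sample should add up to 100, a quick row-sum check can catch transcription slips in a table like this one. The sketch below uses the values exactly as posted for the standard database; the function name and tolerance are arbitrary choices for illustration.

```python
from typing import Dict, List, Tuple

# (classified %, unclassified %) per sample, standard database, as posted.
STANDARD: Dict[str, Tuple[float, float]] = {
    "R22.K": (46.85, 53.15), "R22.L": (24.93, 75.07), "R22.S": (46.44, 53.56),
    "R26.K": (52.83, 47.17), "R26.L": (33.54, 66.46), "R26.S": (47.44, 52.56),
    "R27.K": (56.64, 43.36), "R27.L": (35.5, 74.5), "R27.S": (42.27, 57.73),
    "R28.K": (25.73, 74.27), "R28.L": (28.1, 71.9), "R28.S": (26.79, 73.21),
}


def inconsistent_rows(table: Dict[str, Tuple[float, float]],
                      tolerance: float = 0.01) -> List[str]:
    """Return sample IDs whose two percentages do not sum to 100."""
    return [sample for sample, (classified, unclassified) in table.items()
            if abs(classified + unclassified - 100.0) > tolerance]


if __name__ == "__main__":
    print(inconsistent_rows(STANDARD))  # prints ['R27.L'] (35.5 + 74.5 = 110)
```

The check cannot tell which of the two R27.L values is off, so that row is best confirmed against the original kraken2 report.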
rx32940 commented 4 years ago

With the custom database, Rattus replaced Homo as the taxon with the highest read abundance. The percentage of total reads classified also increased substantially for most samples (for example, R28.K rose from 25.73% to 86.29%).

output in Dropbox:

$HOME/Dropbox/5. Rachel's projects/Metagenomic_Analysis/Kraken2-standard/custom/phylum(genus)/results

output on sapelo2:

/scratch/rx32940/kraken/output/custom_bracken/genus(phylum)

Sample ID  Classified (%)  Unclassified (%)
R22.K      70.92           29.08
R22.L      30.43           69.57
R22.S      62.96           37.04
R26.K      70.03           29.97
R26.L      44.85           55.15
R26.S      63.24           36.76
R27.K      69.66           30.34
R27.L      32.45           67.55
R27.S      61.73           38.27
R28.K      86.29           13.71
R28.L      83.42           16.58
R28.S      83.75           16.25

rx32940 commented 4 years ago

For comparison, the classification rates obtained with the minikraken2 library:

Sample ID  Classified  Unclassified
R22.K      14.72%      85.28%
R22.L      6.03%       93.97%
R22.S      13.46%      86.54%
R26.K      14.45%      85.55%
R26.L      7.55%       92.45%
R26.S      10.83%      89.17%
R27.K      13.85%      86.15%
R27.L      6.62%       93.38%
R27.S      10.89%      89.11%
R28.K      8.58%       91.42%
R28.L      7.45%       92.55%
R28.S      6.52%       93.48%

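To put the three databases side by side, the per-sample gain in classified reads can be computed in percentage points. This is a small sketch using the classified percentages as posted in this thread; the `gain` helper is a name introduced here for illustration.

```python
from typing import Dict

# Classified read percentages copied from the three tables in this thread.
CLASSIFIED: Dict[str, Dict[str, float]] = {
    "minikraken2": {
        "R22.K": 14.72, "R22.L": 6.03, "R22.S": 13.46, "R26.K": 14.45,
        "R26.L": 7.55, "R26.S": 10.83, "R27.K": 13.85, "R27.L": 6.62,
        "R27.S": 10.89, "R28.K": 8.58, "R28.L": 7.45, "R28.S": 6.52,
    },
    "standard": {
        "R22.K": 46.85, "R22.L": 24.93, "R22.S": 46.44, "R26.K": 52.83,
        "R26.L": 33.54, "R26.S": 47.44, "R27.K": 56.64,
        "R27.L": 35.5,  # as posted; that row's percentages sum to 110
        "R27.S": 42.27, "R28.K": 25.73, "R28.L": 28.1, "R28.S": 26.79,
    },
    "custom": {
        "R22.K": 70.92, "R22.L": 30.43, "R22.S": 62.96, "R26.K": 70.03,
        "R26.L": 44.85, "R26.S": 63.24, "R27.K": 69.66, "R27.L": 32.45,
        "R27.S": 61.73, "R28.K": 86.29, "R28.L": 83.42, "R28.S": 83.75,
    },
}


def gain(sample: str, baseline: str, improved: str) -> float:
    """Change in classified reads, in percentage points, between two databases."""
    return round(CLASSIFIED[improved][sample] - CLASSIFIED[baseline][sample], 2)


if __name__ == "__main__":
    print("sample\tmini->standard\tstandard->custom")
    for sample in CLASSIFIED["custom"]:
        print(f"{sample}\t{gain(sample, 'minikraken2', 'standard'):+.2f}"
              f"\t{gain(sample, 'standard', 'custom'):+.2f}")
```

Reporting the difference in percentage points rather than as a ratio keeps samples with very low baseline rates from dominating the comparison.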
rx32940 commented 4 years ago

Relative abundance of each genus identified in the metagenomic samples with the custom Kraken2 library

Absolute abundance of each genus identified in the metagenomic samples with the custom Kraken2 library (figure: genus_absolute_top10)

rx32940 commented 4 years ago

Relative and absolute abundance at the phylum level (figures: kraken2_phylum_relative_custom, kraken2_phylum_absolute_custom)

rx32940 commented 4 years ago

code for barplots in the last two issues

rx32940 commented 4 years ago

/Users/rx32940/Dropbox/5.Rachel-projects/Metagenomic_Analysis/Kraken2-standard/custom/genus/top_20_genus_for_each_sample.csv
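A per-sample top-genus list like the CSV above could be derived from Bracken's tab-separated output along the following lines. This is a sketch rather than the code that produced the file: it relies on Bracken's `name` and `new_est_reads` columns, and the toy input uses made-up genus names and read counts purely to show the shape of the data.

```python
import csv
import io
from typing import IO, List, Tuple


def read_bracken(handle: IO[str]) -> List[Tuple[str, int]]:
    """Read (taxon name, estimated reads) pairs from a Bracken output table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["name"], int(row["new_est_reads"])) for row in reader]


def top_taxa(counts: List[Tuple[str, int]], n: int = 20) -> List[Tuple[str, int, float]]:
    """Return the n most abundant taxa with absolute reads and relative abundance."""
    total = sum(reads for _, reads in counts)
    ranked = sorted(counts, key=lambda item: item[1], reverse=True)[:n]
    return [(name, reads, reads / total if total else 0.0) for name, reads in ranked]


if __name__ == "__main__":
    # Toy table with invented values, in Bracken's column layout.
    toy = (
        "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
        "added_reads\tnew_est_reads\tfraction_total_reads\n"
        "GenusA\t1\tG\t40\t10\t50\t0.5\n"
        "GenusB\t2\tG\t25\t5\t30\t0.3\n"
        "GenusC\t3\tG\t18\t2\t20\t0.2\n"
    )
    for name, reads, fraction in top_taxa(read_bracken(io.StringIO(toy)), n=2):
        print(f"{name},{reads},{fraction:.3f}")
```

Relative abundance here is computed against all taxa in the table, not only the top n, so the fractions of a truncated list need not sum to 1.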