genomic-medicine-sweden / taxprofiler

Taxonomic profiling of shotgun metagenomic data
https://nf-co.re/taxprofiler
MIT License

Construct databases for all profilers from the same input refseq data #33

Open LilyAnderssonLee opened 1 year ago

LilyAnderssonLee commented 1 year ago

Construct databases using the same refseq data as the kraken2 database, which includes

The kraken2 database can be downloaded from https://benlangmead.github.io/aws-indexes/k2

The list of refseq sequences is stored in the tsv file: https://genome-idx.s3.amazonaws.com/kraken/pluspf_20231009/library_report.tsv
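To reuse the same sequences for the other profilers, the report has to be turned into a list of files to fetch. A minimal sketch of that step, assuming the tsv has three tab-separated columns (library, sequence name, URL) under a header line starting with `#`; that layout should be checked against the actual file. The function name `urls_by_library` and the sample rows are made up for illustration and are not real report entries.

```python
from collections import defaultdict


def urls_by_library(tsv_text):
    """Group unique sequence URLs by library name.

    Assumes three tab-separated columns: library, sequence name, URL.
    This column layout is an assumption, not confirmed by the issue.
    """
    grouped = defaultdict(set)
    for line in tsv_text.splitlines():
        # Skip blank lines and the header line (assumed to start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # ignore malformed rows rather than guessing
        library, url = fields[0], fields[2]
        # Several sequences can come from the same file, so de-duplicate.
        grouped[library].add(url)
    return {library: sorted(urls) for library, urls in grouped.items()}


# Placeholder rows for illustration only; not taken from the real report.
SAMPLE = "\n".join([
    "#Library\tSequence Name\tURL",
    "bacteria\t>SEQ_A1 placeholder chromosome\thttps://example.org/a_genomic.fna.gz",
    "bacteria\t>SEQ_A2 placeholder plasmid\thttps://example.org/a_genomic.fna.gz",
    "bacteria\t>SEQ_B1 placeholder chromosome\thttps://example.org/b_genomic.fna.gz",
    "viral\t>SEQ_C1 placeholder genome\thttps://example.org/c_genomic.fna.gz",
])

result = urls_by_library(SAMPLE)
for library, urls in result.items():
    print(library, len(urls))
```

For the real file, the text of library_report.tsv would be read from disk and passed in place of `SAMPLE`; the resulting per-library URL lists can then be fed to whatever download step each profiler's database build uses.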

Update databases whenever the kraken2 database is updated.

This issue can be closed after construction of these databases:

LilyAnderssonLee commented 11 months ago

kaiju: The database construction failed on hasta because it exceeded the 450 GB memory limit. This database was built on UPPMAX Bianca with 512 GB of memory instead.