DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Request to provide Kraken2 index for the FDA-ARGOS #839

Open Rohit-Satyam opened 3 months ago

Rohit-Satyam commented 3 months ago

Dear @dfornika @BenLangmead

Greetings!! Is it possible to provide FDA-ARGOS Kraken2 indexes? The FDA-ARGOS Nature Communications article curated a set of highly reliable genomes and is becoming quite popular.

jenniferlu717 commented 2 months ago

Looking into this

ChillarAnand commented 1 month ago

Any update on this? @jenniferlu717

I am planning to build the same index as well.

ChillarAnand commented 1 month ago

I have written a script to download all the ARGOS GenBank files, convert them to FASTA, and add them to the Kraken library.

I haven't used GenBank until now, and I don't know how to use taxonomy data with GenBank files.

sounkou-bioinfo commented 1 month ago

Any update on this, @jenniferlu717? @ChillarAnand, is it possible to share the script?

ChillarAnand commented 1 month ago

I haven't looked into GenBank taxonomy yet.

I have created kraken-db-builder to ease the process of building a database.

https://avilpage.com/kdb.html

ARGOS provides a text file containing the accession list. Using it, we can download all the FASTA files into a directory and then run kraken2-build or use the kdb command.
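A minimal sketch of the kraken2-build route, assuming the accession list has one accession per line and that the FASTA files are already downloaded as `<accession>.fasta` in a single directory. The file naming, the `fasta` directory, the `argos_db` database name, and the placeholder accessions are all hypothetical. The script only assembles the commands and prints them, so they can be reviewed before anything is run.

```python
from pathlib import Path


def read_accessions(text):
    """Return accessions from a list with one accession per line.

    Blank lines and lines starting with '#' are skipped (assumed format).
    """
    accessions = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Keep only the first whitespace-separated field on the line.
            accessions.append(line.split()[0])
    return accessions


def plan_build(accessions, fasta_dir, db_name, threads=4):
    """Build the list of kraken2-build commands, without running them."""
    # Step 1: fetch the NCBI taxonomy into the database directory.
    commands = [["kraken2-build", "--download-taxonomy", "--db", db_name]]
    for acc in accessions:
        # Hypothetical naming scheme: one FASTA file per accession.
        fasta = Path(fasta_dir) / f"{acc}.fasta"
        # Step 2: add each genome to the database's library.
        commands.append(
            ["kraken2-build", "--add-to-library", str(fasta), "--db", db_name]
        )
    # Step 3: build the index from the taxonomy and the library.
    commands.append(
        ["kraken2-build", "--build", "--threads", str(threads), "--db", db_name]
    )
    return commands


if __name__ == "__main__":
    demo_list = "# placeholder accessions\nACC_0001\nACC_0002\n"
    for cmd in plan_build(read_accessions(demo_list), "fasta", "argos_db"):
        print(" ".join(cmd))
```

To execute the plan, each command could be passed to `subprocess.run(cmd, check=True)`. Note that Kraken 2 must be able to map every sequence to a taxon, so each FASTA header needs either an NCBI accession or an explicit `kraken:taxid|<id>` tag.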

ChillarAnand commented 1 month ago

I finally managed to build the index. Complete tutorial and various files to download are available here.

https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html

sounkou-bioinfo commented 1 month ago

> I finally managed to build the index. Complete details and various files to download are available here.
>
> https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html

Great tutorial, @ChillarAnand! To make it complete, I would probably add the human library, just in case host trimming was not complete.

Rohit-Satyam commented 3 weeks ago

I agree with @sounkou-bioinfo. Are there Bracken indexes too?