TaxTriage is a Nextflow workflow designed to agnostically identify and classify microbial organisms within short- or long-read metagenomic NGS data. This flexible tool was developed with various use-cases of mNGS in mind.
MIT License
Use local Refseq mirror for post-kraken2 accession querying #61
Currently, you have three options for obtaining reference sequences: pull the taxa from NCBI after kraken2 (requires internet access), pull them manually (also requires internet access), and/or skip kraken2 and supply a local reference FASTA file. However, some users would rather point the workflow at a full local RefSeq set of FASTA files (spread across many directories and subdirectories) and extract the needed sequences from it for the realignment step.
Considerations:
Reading the headers of all files is too unwieldy.
Consider supplying a mapping file of accession to file path, OR extracting the accession from the filename itself, e.g. GCF....fasta
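To make the two options concrete, here is a rough Python sketch, not a proposal for the actual TaxTriage implementation. It assumes filenames begin with an assembly accession of the form `GCF_<digits>.<version>` and that the mapping file is a two-column, tab-separated file (accession, path). The function names, the suffix list, and the mapping layout are all hypothetical.

```python
import csv
import re
from pathlib import Path

# Assumption: the filename starts with an accession such as GCF_000005845.2.
ACCESSION_RE = re.compile(r"^(GCF_\d+\.\d+)")
# Assumption: these are the FASTA suffixes used in the local mirror.
FASTA_SUFFIXES = (".fasta", ".fa", ".fna", ".fasta.gz", ".fa.gz", ".fna.gz")


def index_by_filename(root):
    """Map accession -> path by scanning filenames only; file contents are never read."""
    index = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or not path.name.endswith(FASTA_SUFFIXES):
            continue
        match = ACCESSION_RE.match(path.name)
        if match:
            # Keep the first file seen for an accession if duplicates exist.
            index.setdefault(match.group(1), path)
    return index


def index_from_mapping(mapping_file):
    """Load accession -> path from a tab-separated mapping file (hypothetical layout)."""
    index = {}
    with open(mapping_file, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip short rows and comment/header rows starting with '#'.
            if len(row) >= 2 and not row[0].startswith("#"):
                index[row[0]] = Path(row[1])
    return index


def resolve(accessions, index):
    """Split the requested accessions into those found locally and those missing."""
    found = {acc: index[acc] for acc in accessions if acc in index}
    missing = [acc for acc in accessions if acc not in index]
    return found, missing
```

Either index could then be queried with the accessions selected after kraken2. Anything returned in `missing` would still need one of the existing routes (NCBI pull or a user-supplied FASTA).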