torognes / vsearch

Versatile open-source tool for microbiome analysis

More than 7 levels in sintax #498

Open lokeshbio opened 1 year ago

lokeshbio commented 1 year ago

At the moment, sintax runs only work with the default 7 levels, going from kingdom to species. It would be great if it could also accept more levels, such as strains or bins!

torognes commented 1 year ago

The sintax command in vsearch currently accepts the following 8 levels:

If we were to include further levels, I think they would have to adhere to some standard that assigns additional letters to those levels. Are there any such standards?

lokeshbio commented 1 year ago

Thanks for the reply @torognes! It seems the maximum is 8, rather than 7 as I thought! But I couldn't find any information about adding sub-levels, or any such standards, in the documentation here

todd-desantis commented 1 year ago

The StrainSelect database (https://pubmed.ncbi.nlm.nih.gov/36814618/) uses 't' as the 8th level to designate strain. Taxonomy and sequence files are available here, if you decide to extend the functionality of vsearch --sintax: https://greengenes.secondgenome.com/?prefix=downloads/strainselect_database/StrainSelect21/

torognes commented 5 months ago

I have added a ninth taxonomy level, strain (t), in vsearch 2.28.1, just released. It also includes other major improvements to the sintax command.
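
For readers checking whether their reference database fits the nine levels, here is a minimal Python sketch that parses a sintax-style `tax=` annotation from a FASTA header and rejects unknown level letters. It is not part of vsearch. The letters for the eight earlier levels (d, k, p, c, o, f, g, s) are my reading of the vsearch manual and are not listed in this thread; only `t` for strain is confirmed above. The function name `parse_sintax_header` and the example header are hypothetical.

```python
import re

# Assumed level letters: d..s are taken from my reading of the vsearch manual,
# 't' (strain) is the level added in vsearch 2.28.1 according to this thread.
RANKS = {
    "d": "domain", "k": "kingdom", "p": "phylum", "c": "class",
    "o": "order", "f": "family", "g": "genus", "s": "species",
    "t": "strain",
}


def parse_sintax_header(header):
    """Return (sequence_id, [(level_letter, name), ...]) for one FASTA header.

    Hypothetical helper: expects a header such as
    '>id;tax=d:Bacteria,p:Firmicutes,...;' and raises ValueError
    when a field does not use one of the letters in RANKS.
    """
    header = header.lstrip(">").strip()
    seq_id, _, rest = header.partition(";")
    match = re.search(r"tax=([^;]*)", rest)
    if match is None:
        return seq_id, []  # no taxonomy annotation present
    lineage = []
    for field in match.group(1).split(","):
        if not field:
            continue  # tolerate an empty annotation or a trailing comma
        letter, sep, name = field.partition(":")
        if not sep or letter not in RANKS:
            raise ValueError(f"unrecognised taxonomy field: {field!r}")
        lineage.append((letter, name))
    return seq_id, lineage


if __name__ == "__main__":
    # Illustrative header only; not taken from any specific database file.
    example = (">seq1;tax=d:Bacteria,p:Firmicutes,c:Bacilli,"
               "o:Lactobacillales,f:Streptococcaceae,g:Streptococcus,"
               "s:Streptococcus_mutans,t:UA159;")
    seq_id, lineage = parse_sintax_header(example)
    for letter, name in lineage:
        print(f"{RANKS[letter]}\t{name}")
```

Running such a check over a database before calling `vsearch --sintax` would flag entries that use a level letter outside the assumed set, for example bins or sub-levels.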