torognes / vsearch

Versatile open-source tool for microbiome analysis

Feature request: --selfmap filename, or --self with an optional filename, etc. #461

Open crosenth opened 2 years ago

crosenth commented 2 years ago

I use vsearch --usearch_global with --self a lot on a 16S database and it works great. But sometimes a query sequence matches a 16S allele from the same genome. A --selfmap filename argument would be great for grouping sequences by accession, genome, or whatever, so that sequences in the same group cannot match each other. The file could be a headerless CSV with two columns: label,genomelabel. One additional specification: a label could appear multiple times in the map file, assigned to multiple genomelabel groups. In other words, a sequence could be a member of multiple genomes, which is weird but also useful.
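To make the requested behavior concrete, here is a minimal Python sketch of the same grouping rule applied as a post-filter. It assumes tab-separated hit output (for example from --blast6out) with the query label in the first column and the target label in the second. The names load_selfmap and filter_hits are hypothetical helpers, not part of vsearch. Filtering afterwards is probably not equivalent to a built-in option, because same-genome hits would still count toward --maxaccepts during the search.

```python
import csv
from collections import defaultdict


def load_selfmap(lines):
    """Parse headerless 'label,genomelabel' rows into label -> set of groups.

    A label may appear on several rows, so each label maps to a set.
    """
    groups = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        label, genome = row[0].strip(), row[1].strip()
        groups[label].add(genome)
    return groups


def filter_hits(hit_lines, groups):
    """Yield hit lines whose query and target share no genome group.

    Assumes tab-separated lines with query in column 1 and target in column 2.
    Labels missing from the map belong to no group and are never excluded.
    """
    for line in hit_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        query, target = fields[0], fields[1]
        if groups.get(query, set()) & groups.get(target, set()):
            continue  # same group: treat like a self hit and drop it
        yield line


if __name__ == "__main__":
    # Made-up labels for illustration only.
    selfmap = ["seqA,g1", "seqB,g1", "seqB,g2", "seqC,g2"]
    hits = ["seqA\tseqB\t99.0", "seqA\tseqC\t98.5", "seqB\tseqC\t97.0"]
    for kept in filter_hits(hits, load_selfmap(selfmap)):
        print(kept)
```

With the made-up data above, only the seqA/seqC hit survives: seqA and seqB share g1, and seqB and seqC share g2.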

Thanks!