iTaxoTools / TaxI2

Calculation and analysis of pairwise sequence distances
GNU General Public License v3.0

Add new program mode: decontaminate #21

Closed: mvences closed 3 years ago

mvences commented 3 years ago

This would be one further program mode in addition to "Compare all against all", "Compare against reference", and "Dereplicate".

The "Decontaminate" mode would be a variant of the "Compare against reference" mode in which matches are removed from the input file.

The typical application for this mode is a set of sequences from a certain group of organisms that erroneously contains sequences from other, non-target organisms. For example, one has sequenced many frog samples but suspects that some of the sequences come not from the frogs themselves but from pathogens or parasites of the frogs.

The program would then compare all sequences in the input file with a set of reference sequences, and remove all the matches.

The general procedure would be very similar to the "Compare against reference" mode.
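To make the proposal concrete, here is a rough sketch of what the filtering step could look like. It is not taken from the TaxI2 code base: the function names, the uncorrected p-distance, the 0.03 threshold, and the rule that "distance at or below the threshold counts as a match" are all assumptions for illustration, and the sequences are assumed to be pre-aligned to equal length.

```python
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance; positions with a gap or N in either sequence are skipped.

    Assumes both sequences are already aligned to the same length.
    """
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    # With no comparable positions, treat the pair as maximally distant.
    return mismatches / compared if compared else 1.0


def decontaminate(
    inputs: Dict[str, str],
    references: Dict[str, str],
    threshold: float = 0.03,  # hypothetical default, not a TaxI2 setting
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split inputs into (kept, removed) by closest distance to any reference."""
    kept: Dict[str, str] = {}
    removed: Dict[str, str] = {}
    for name, seq in inputs.items():
        closest = min(
            (p_distance(seq, ref) for ref in references.values()),
            default=1.0,
        )
        if closest <= threshold:
            removed[name] = seq  # matches a contaminant reference
        else:
            kept[name] = seq
    return kept, removed


# Made-up toy data: two target sequences and one that equals the reference.
inputs = {
    "frog_1": "ACGTACGTAC",
    "frog_2": "ACGTACGTAA",
    "suspect_1": "TTGTACCTAG",
}
references = {"contaminant_ref": "TTGTACCTAG"}

kept, removed = decontaminate(inputs, references)
```

In this toy run only `suspect_1` lands in `removed`, since it is identical to the reference, while both frog sequences differ from it at 4 of 10 positions and are kept. Returning the removed sequences alongside the kept ones would let the program write both to separate output files, so the user can check what was filtered out.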

User setting options:

Output files: