Is your feature request related to a problem? Please describe.
Many sequences in the BLAST database are wrongly annotated and show significant discrepancies with other sequences from the same taxon. This leads to difficulties with consensus determination and to results being annotated at the genus level or higher.
Describe the solution you'd like
Problematic sequences can be identified by careful examination of the results. They could then be marked as such and excluded from the results, similar to the taxid-blocklist process that is already implemented.
Describe alternatives you've considered
The blastn CLI only allows excluding taxa OR sequences, but not both simultaneously. The easiest implementation would be to filter sequence IDs on the BLAST results.
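
A minimal sketch of what such post-filtering could look like, not a proposal for the actual implementation. It assumes the BLAST results are available as tab-separated rows with the subject sequence ID in the second column, and that blocked IDs are kept in a plain text file with one ID per line. The function names, the blocklist format, and the accessions are all made up for illustration.

```python
import csv
import io


def load_blocklist(lines):
    """Return the set of blocked sequence IDs, ignoring blanks and '#' comments."""
    blocked = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            blocked.add(entry)
    return blocked


def filter_hits(rows, blocked, subject_col=1):
    """Yield only the rows whose subject ID is not in the blocklist."""
    for row in rows:
        if row[subject_col] not in blocked:
            yield row


# Made-up data: three hits, one blocked accession (hypothetical IDs).
blast_tsv = "q1\tAB000001.1\t99.5\nq1\tAB000002.1\t98.7\nq2\tAB000001.1\t97.0\n"
blocklist_txt = "# wrongly annotated\nAB000001.1\n"

blocked = load_blocklist(io.StringIO(blocklist_txt))
rows = csv.reader(io.StringIO(blast_tsv), delimiter="\t")
kept = list(filter_hits(rows, blocked))
print(kept)
```

Because this runs on the results rather than on the blastn call, it can be combined with the existing taxid exclusion. It relies on exact ID matches, so the IDs in the blocklist would need to use the same format as the subject IDs in the results.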
Additional context
A better solution would be to provide a means to curate the database, e.g. with Spec4ID. This is a somewhat more complicated approach and does not preclude filtering specific sequences.