Open rvosa opened 9 years ago
I think there are a number of different issues that come into play here, none of which are super important, but they can lead to nags from users and reviewers:
`smrt taxize` so that only names that look like decent binomials are kept. On the other hand, we can also say that users need to exercise due diligence and inspect their `species.tsv` and their alignments.
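To make the binomial idea concrete, here is a minimal Python sketch of such a name filter. It is not the actual `smrt taxize` code. The function names are hypothetical, and the definition of a "decent binomial" (a capitalized genus plus a lowercase epithet, nothing else) is an assumption that would probably need tuning.

```python
import re

# Assumption: a "decent binomial" is a capitalized genus followed by a single
# lowercase epithet (internal hyphens allowed) and nothing else. Names such as
# "Homo sp." or "Pan cf. troglodytes" are rejected by this pattern.
BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z]+(?:-[a-z]+)*$")


def looks_like_binomial(name):
    """Return True if the name matches the assumed binomial pattern."""
    return BINOMIAL.match(name.strip()) is not None


def filter_binomials(names):
    """Keep only the names that look like binomials, preserving order."""
    return [n for n in names if looks_like_binomial(n)]
```

Note that this pattern also drops trinomials (subspecies), which may or may not be what we want.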
It would be a good enhancement to be able to filter sequences on annotation values that users might take as indicators of sequence quality. For example, the following sequence has an annotation that gives the identifier of the voucher specimen, which is useful information for systematists in assessing whether the taxonomic identification is actually credible: http://www.ncbi.nlm.nih.gov/nuccore/JQ922076.1 (on line 5 of the features table).
This would require that we actually populate and use the features table (https://github.com/naturalis/supersmart/blob/master/lib/Bio/Phylo/PhyLoTA/DAO/Result/Feature.pm), where the sequence should have a `primary_tag` whose value is `specimen_voucher`.
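As a rough illustration of the voucher filter, here is a Python sketch that works on plain in-memory records rather than the real features table. The record layout (`accession`, `features`, `value`) and the function names are assumptions made for the example. Only `primary_tag` and `specimen_voucher` come from the description above.

```python
def has_voucher(features):
    """Return True if any feature row is a non-empty specimen voucher.

    Assumption: each feature row is a dict with a 'primary_tag' key and a
    'value' key; the real Feature.pm columns may be named differently.
    """
    return any(
        f.get("primary_tag") == "specimen_voucher" and f.get("value")
        for f in features
    )


def filter_vouchered(sequences):
    """Keep only the sequence records that carry a specimen voucher."""
    return [s for s in sequences if has_voucher(s.get("features", []))]


# Placeholder records, not real GenBank data.
example = [
    {"accession": "SEQ1", "features": [{"primary_tag": "specimen_voucher", "value": "VOUCHER-1"}]},
    {"accession": "SEQ2", "features": [{"primary_tag": "organism", "value": "Homo sapiens"}]},
    {"accession": "SEQ3", "features": []},
]
kept = [s["accession"] for s in filter_vouchered(example)]
```

In the real pipeline this would presumably be a query against the features table rather than a Python list comprehension, and it should stay optional so that users can decide how strict to be.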