naturalis / supersmart

Self-Updating Platform for the Estimation of Rates of Speciation, Migration And Relationships of Taxa
MIT License

Filter on certain annotation values #23

Open rvosa opened 9 years ago

rvosa commented 9 years ago

It would be a good enhancement if it were possible to filter sequences on annotation values that users might take as indicators of sequence quality. For example, the following sequence has an annotation giving the identifier of the voucher specimen, which systematists find useful when assessing whether the taxonomic identification is actually credible: http://www.ncbi.nlm.nih.gov/nuccore/JQ922076.1 (on line 5 of the features table).

This would require that we actually populate and use the features table (https://github.com/naturalis/supersmart/blob/master/lib/Bio/Phylo/PhyLoTA/DAO/Result/Feature.pm), where the sequence should have a primary_tag whose value is specimen_voucher.
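To make the idea concrete, here is a minimal sketch of such a filter. It is not how SUPERSMART does (or would do) it: it is written in Python rather than the project's Perl, it works directly on GenBank flat-file text instead of the features table, and the function names (`qualifiers`, `filter_records`) and both sample records (accessions XX000001 and XX000002, voucher "HERB 12345") are made up for illustration.

```python
import re

# Matches feature-table qualifiers of the form /name="value" in GenBank flat-file text.
QUALIFIER_RE = re.compile(r'^\s+/(\w+)="([^"]*)"', re.MULTILINE)


def qualifiers(record_text):
    """Return a dict mapping qualifier name to the list of values found in one record."""
    found = {}
    for name, value in QUALIFIER_RE.findall(record_text):
        found.setdefault(name, []).append(value)
    return found


def filter_records(records, required="specimen_voucher"):
    """Keep only records carrying a non-empty value for the required qualifier."""
    return [
        text for text in records
        if any(value.strip() for value in qualifiers(text).get(required, []))
    ]


# Hypothetical records, abbreviated to the parts that matter here.
VOUCHERED = '''LOCUS       XX000001
FEATURES             Location/Qualifiers
     source          1..600
                     /organism="Genus species"
                     /specimen_voucher="HERB 12345"
//'''

UNVOUCHERED = '''LOCUS       XX000002
FEATURES             Location/Qualifiers
     source          1..600
                     /organism="Genus species"
//'''

kept = filter_records([VOUCHERED, UNVOUCHERED])
```

With the two sample records above, only the vouchered one is kept. A database-backed version would presumably query the features table for the same tag instead of re-parsing flat files.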

rvosa commented 9 years ago

I think there are a number of different issues that come into play here. None of them is super important, but they can lead to nags from users and reviewers:

On the other hand, we can also say that users need to exercise due diligence and inspect their species.tsv and their alignments.