broadinstitute / adapt

A package for designing activity-informed nucleic acid diagnostics for viruses.
MIT License

Automatically get reference sequences; sub-species specificity #32

Closed priyappillai closed 3 years ago

priyappillai commented 3 years ago

Automatically get reference sequences from NCBI's database by default; allow reference sequences to be supplied manually using --ref-accs. Change the default arguments of auto-args so that a reference sequence is not required if --auto-refs is specified. Also add sub-species specificity based on metadata using --metadata-filter and --specific-against-metadata-filter: --metadata-filter determines which accessions within a taxon to include, and --specific-against-metadata-filter determines which accessions within a taxon the design should be specific against.
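To illustrate the intended split between the two metadata filters, here is a minimal sketch. It is not ADAPT's implementation: the function names, the filter representation (a dict mapping a metadata key to a set of allowed values), and the accession labels are all assumptions made for this example, and the real argument syntax may differ.

```python
def matches(metadata, metadata_filter):
    """Return True if, for every key in the filter, the record's value is allowed."""
    return all(metadata.get(key) in allowed
               for key, allowed in metadata_filter.items())


def partition_by_metadata(records, include_filter, specific_against_filter):
    """Split the accessions of one taxon into two groups (hypothetical helper).

    records: dict mapping accession -> metadata dict
    include_filter: plays the role of --metadata-filter
    specific_against_filter: plays the role of --specific-against-metadata-filter
    """
    # Accessions to design for
    design_for = sorted(acc for acc, md in records.items()
                        if matches(md, include_filter))
    # Accessions to be specific against; never overlaps with design_for
    specific_against = sorted(acc for acc, md in records.items()
                              if acc not in design_for
                              and matches(md, specific_against_filter))
    return design_for, specific_against


# Made-up accessions and metadata, for illustration only
records = {
    "ACC0001": {"region": "Asia"},
    "ACC0002": {"region": "Africa"},
    "ACC0003": {"region": "Asia"},
    "ACC0004": {"region": "Europe"},
}

design_for, specific_against = partition_by_metadata(
    records,
    include_filter={"region": {"Asia"}},
    specific_against_filter={"region": {"Africa"}},
)
print(design_for)
print(specific_against)
```

In this sketch, accessions matching neither filter (here, the European one) are simply left out of both groups.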

priyappillai commented 3 years ago

In an earlier reply (that I can no longer respond to) you wrote:

I modified tax_id to have an extra decimal to indicate what number subtaxa it was (in the order it comes in the file)

Can you expand on that? I don't see where it is.

I got rid of this because it added unnecessary additional complexity!

Given that design.py was already such a mess to begin with, and this PR adds additional complexity, I think I would feel better if you hold off on merging until there are some basic functional/integration tests and this branch passes them.

We discussed this on Slack, but just adding it here: I made an integration_tests branch and will finish that before merging this PR!

(Also, closing this was an accident)

priyappillai commented 3 years ago

Made the edits mentioned, and also removed --auto-refs as an argument (for the reasons listed in the reply above). I can add --auto-refs back in if you think that makes sense. Otherwise, I'm just waiting on the integration_tests PR before merging this into master.