Closed korikuzma closed 1 year ago
I think we should build into SeqRepo the ability to quickly report or check compatibility of sequences hosted by a seqrepo instance, which would be a precursor step towards realizing a registry of seqrepo instances.
Alex, can you please explain in more detail? What does it mean to "report or check compatibility of sequences"?
As you said, creating a custom seqrepo is pretty easy. What are some specific goals for a registry? How would they be implemented (or is part of this project to figure that out)?
I'm looking for answers to questions like:
sha512t24:a981407bef983ba3fc54e045, korikuzma/somesequences, https://korislaptop:9876/sequenceset1/.
I would like to add support for the Sequence Collections specification, specifically to add support for retrieving and comparing sequence collections.
What I would like to see is a standardized interface for checking if a set of sequences (e.g., sequences collected from VRS objects in a resource) is supported by a seqrepo instance. This is part of the vision of allowing users to load custom sequences into seqrepo and use those sequences to report VRS variants, as we have done for our work with the Atlas for Variant Effects Alliance (AVE).
If that instance (or another instance containing those sequences) is part of a federated network of seqrepo instances, it can report in a standardized way that the sequence collection is retrievable, or tell the user which sequences are not retrievable.
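To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not existing seqrepo or seqcol API: `CompatibilityReport`, `check_compatibility`, and the `has_sequence` callable are hypothetical names, and the identifiers are made up. A plain Python set stands in for a seqrepo instance.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class CompatibilityReport:
    """Hypothetical result of checking requested sequences against one instance."""

    retrievable: List[str] = field(default_factory=list)
    missing: List[str] = field(default_factory=list)

    @property
    def compatible(self) -> bool:
        # Compatible only when every requested sequence can be retrieved.
        return not self.missing


def check_compatibility(
    requested: Iterable[str],
    has_sequence: Callable[[str], bool],
) -> CompatibilityReport:
    """Split requested identifiers into retrievable and missing.

    `has_sequence` is an assumed lookup hook; a real implementation would
    ask a seqrepo instance (or its REST service) whether it hosts the id.
    """
    report = CompatibilityReport()
    seen = set()
    for seq_id in requested:
        if seq_id in seen:  # report each identifier only once
            continue
        seen.add(seq_id)
        if has_sequence(seq_id):
            report.retrievable.append(seq_id)
        else:
            report.missing.append(seq_id)
    return report


# Stand-in for a seqrepo instance: a set of made-up identifiers.
hosted = {"ga4gh:SQ.example1", "ga4gh:SQ.example2"}
report = check_compatibility(
    ["ga4gh:SQ.example1", "ga4gh:SQ.example3"],
    hosted.__contains__,
)
print(report.compatible, report.missing)
```

Passing the lookup in as a callable keeps the check independent of where the sequences live, so the same report shape could come from a local instance or from a remote one in a federated setup.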
This will not be worked on at the hackathon. It can be picked up afterwards in a new issue in the respective repo.
Submitter Name
Alex Wagner (@ahwagner)
Submitter Affiliation
Nationwide Children's Hospital
Requested By
Wagner Lab
Additional Submitter Details
No response
Lead(s)
LEAD NEEDED
biocommons Repo
seqrepo, seqrepo-rest-service
Project Details
A lab member has been inserting his own sequences for MaveDB. Implementing SeqCol allows for a standardized method to check if a collection of sequences used in a dataset (e.g. MaveDB variant maps using VRS) is supported by a seqrepo instance. This has natural implications for use of seqrepo as a federated service.
This would likely be implemented across both seqrepo (method to check sequence collection compatibility) and seqrepo-rest-service (implementing seqcol API spec).
Skill Level
Intermediate
Required Skills
Python