Open diekhans opened 8 years ago
@mbaudis @ejacox @kozbo It would be great to land on one side or the other here before release. Just using reference names seems like it would make people the happiest at this point. Ideas?
Having a different way of specifying position from one protocol to the next is really confusing. To close this, we might remove the alternative reference_id pattern from SearchVariantAnnotations as well.
Does this fit within our external IDs discussion? The reference_name is the identifier used within a particular reference (hg38?). In that view, reference_name is fine if we indicate somewhere what the reference is.
@ejacox if there is a way to elegantly solve it using that mechanism, I'm not against it. My hope is that we can treat this problem simply as using the same position searches throughout the API without needing to do much more modeling. Reference ID and name are used inconsistently. It might still be incorrect to use one or the other, but let's choose one approach.
@dcolligan @delagoya, opinions? We can close this by removing reference ID from SearchReadsRequest and SearchVariantAnnotationsRequest, which seems to align with @richarddurbin's comments here: https://github.com/ga4gh/schemas/pull/616. We would then also merge language like https://github.com/ga4gh/schemas/pull/732 to enforce that references are uniquely named in a reference set.
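To make the uniqueness requirement concrete, here is a minimal sketch of what "uniquely named in a reference set" could look like on the server side. The `Reference` and `ReferenceSet` classes and the `get_by_name` method are hypothetical names for illustration only; they are not the schema types or the reference server's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical stand-in for a reference record (not the schema type)."""
    id: str    # server-generated identifier
    name: str  # e.g. "1", "2", "X"


class ReferenceSet:
    """Hypothetical container that rejects duplicate reference names."""

    def __init__(self, id):
        self.id = id
        self._by_name = {}

    def add(self, reference):
        # Enforce the rule under discussion: one name maps to one reference.
        if reference.name in self._by_name:
            raise ValueError(
                "duplicate reference name %r in reference set %r"
                % (reference.name, self.id))
        self._by_name[reference.name] = reference

    def get_by_name(self, name):
        # With unique names, a name resolves to exactly one reference.
        return self._by_name[name]
```

If names are unique within a set, a (reference set, reference name) pair identifies a reference as unambiguously as a server-generated ID does, which is what would let the search requests drop reference ID.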
I believe it is easiest to work with some genomics data if only the reference name is required for search, since you technically don't have to have a reference set local to your instance. Names such as 1, 2, and 3 are fairly portable.
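As a rough illustration of that point, the sketch below filters variants by reference name and range without consulting any reference set. The `Variant` class and `search_variants` function are hypothetical and only loosely modeled on the request fields discussed here; the half-open, 0-based interval convention is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record for illustration."""
    reference_name: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)


def search_variants(variants, reference_name, start, end):
    """Return variants on reference_name that overlap [start, end).

    No reference set lookup is needed: the name is compared directly.
    """
    return [
        v for v in variants
        if v.reference_name == reference_name
        and v.start < end
        and v.end > start
    ]
```

The same call works against any data set that names its references the same way, whereas a server-generated reference ID would first have to be looked up on each server.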
If we choose IDs, we should implement searching references by name (https://github.com/ga4gh/schemas/pull/665). It is nice to have in either case.
Note that 23andMe opted to use accession_id when specifying a reference for range searches (https://api.23andme.com/docs/reference/), which falls somewhere between using reference names and server-generated identifiers.
The design pattern in the API is that object linkage is done by id, not by name.
The Position object uses referenceName rather than referenceId.
This should be corrected, or the rationale for this inconsistency strongly documented.