Closed nsheff closed 2 years ago
So my preference would be the "object-oriented" approach, i.e. option 2 for comparison. I believe this suggestion follows the intent of the original REST paper most closely. However, for that to make sense I believe the retrieve endpoint should start in the same manner (i.e. option 1 under retrieval). However, adding an endpoint 'seqcol' or similar in front of both might be clearer, i.e. a new suggestion /seqcol/digestA/compare/digestB
for comparison and option 2 for retrieval. I also believe recursionLevel should come at the end of the URL path, the lack of which should default to (redirect to?) level 0. However, I'm fine with whatever the plurality wants.
Would it be an idea to add another optional endpoint for lookup of digests at a lower level, e.g. the digest of the lengths
array only? This could support at least two potential operations:
a) Retrieval of the array contents.
Example:
/seqcol/lengths/fa47a1:
[26382827, ..., 4247]
Here: seqcol/sequences/fafa88
could work as/be redirected to a refget endpoint.
b) lookup to retrieve all registered seqcols that contain the array.
Example:
/seqcol/contains/lengths/fa47a1/:
['ac46bb', '74b33d']
This would then also work as a reverse lookup from refget sequence digest to seqcol digest.
So my preference would be the "object-oriented" approach, i.e. option 2 for comparison. I believe this suggestion follows the intent of the original REST paper most closely. However, for that to make sense I believe the retrieve endpoint should start in the same manner (i.e. option 1 under retrieval). However, adding an endpoint 'seqcol' or similar in front of both might be clearer, i.e. a new suggestion /seqcol/digestA/compare/digestB for comparison and option 2 for retrieval. I also believe recursionLevel should come at the end of the URL path, the lack of which should default to (redirect to?) level 0. However, I'm fine with whatever the plurality wants.
I think if we want to follow REST strictly (and I'm not necessarily saying we should), wouldn't it be /comparison/:digestA/:digestB
?
for a GET
request, you're hitting an endpoint that names the noun of the thing you are retrieving. So I would think it would be:
GET /seqcol/:digest
GET /comparison/:digestA/:digestB
That said I don't at the moment have strong feelings that we necessarily need to stick to the REST "best practices" for naming endpoints (or really if such things are universally accepted anyway). To me, endpoints that are verbs (like compare
) doesn't feel problematic, but maybe I just don't have enough experience with API design.
Added proposal as an ADR here: https://github.com/ga4gh/seqcol-spec/pull/24
Edit: what this says is we use:
GET /collection/:digest
GET /comparison/:digestA/:digestB
One question I have is: Should the /collection
endpoint only be allowed to take the primary digest? Or should the 'level 1 digests' (the array-level digests) also be serviced by that endpoint?
One question I have is: Should the
/collection
endpoint only be allowed to take the primary digest? Or should the 'level 1 digests' (the array-level digests) also be serviced by that endpoint?
Any thoughts here, @tcezard @andrewyatz , others?
Apologies didn't see this at first. So /collection
meaning the top level get me this collection information? My gut says this should only take the top-level digest in since the array level digests don't represent collections. Would supporting arrays make the payloads coming back from the endpoint difficult to predict?
I don't think /collection
should accept level 1 digest since these digests do not represent collections most of the time.
A separate endpoint could provide this but I would argue that we should wait and see if the demand exists first.
A possible syntax for level 1 digest resolution endpoints could be e.g. /collections/lengths/<digest>
. This could be added as optional endpoints in the spec. I have a bit of a concern out of the FAIR principles (more specifically principle A1: https://www.go-fair.org/fair-principles/metadata-retrievable-identifier-standardised-communication-protocol/) with the idea of us defining identifiers (digests) for the arrays at level 1 without providing a standardised way to resolve such identifiers.
Would supporting arrays make the payloads coming back from the endpoint difficult to predict?
Yes, this is a good point. it means the server would return things of different structure, depending on what the input was.
A possible syntax for level 1 digest resolution endpoints could be e.g. /collections/lengths/
Rather than /collection/lengths/<digest>
, maybe it would be /array/<digest>
, which is then universal for all level 1 digests? This seems to fit better with the current naming scheme
The decision on endpoint structure was decided and documented in an ADR in PR #24
Spinoff from #21.
How should the API user interface be designed?
1. Compare endpoint
We decided the GET and POST should both use
/compare
somehow. Should it be:/compare/:digestA/:digestB
, or/:digestA/compare/:digestB
The first seems to envision
compare
as a standalone function which takes 2 values. The second seems to envisioncompare
as an operation that can be done on a particular seqcol, with a second one as the function parameter.2. Sequence collection retrieval endpoints
Along these lines, how should the retrieval endpoints work? What is the name of the endpoint?
/:digest/:recursionLevel
(e.g./a6748aa0f6a1e165f871dbed5e54ba62/1
for level 1)./seqcol/:digest/:recursionLevel
/retrieve/:digest/:recursionLevel
,/retrieve/:recursionLevel/:digest
,Are there any other endpoints that should be provided?