ga4gh-beacon / beacon-v2

Unified repository for the GA4GH Beacon v2 API standard
Creative Commons Zero v1.0 Universal
27 stars 22 forks source link

Invalid CollectionsResponse in the OpenAPI #116

Closed redmitry closed 7 months ago

redmitry commented 8 months ago

Hi all,

cohorts/endpoints.json and datasets/endpoints.json have CollectionsResponse response object derfined as a choice of beaconBooleanResponse, beaconCountResponse and beaconCollectionsResponse which is wrong.

Should be defined as beaconCollectionsResponse precising the type according the endpoint cohorts/defaultSchema.json or cohorts/defaultSchema.json.

Best,

Dmitry

mbaudis commented 8 months ago

True. And the "ResultsOKResponse" should also only have this value, right?

components:
  responses:
    ResultsOKResponse:
      description: Successful operation.
      content:
        application/json:
          schema:
            $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconCollectionsResponse.json
    CollectionsResponse:
      description: Successful collection list operation.
      content:
        application/json:
          schema:
            $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconCollectionsResponse.json
redmitry commented 8 months ago

True. And the "ResultsOKResponse" should also only have this value, right?

This would be an improvement because neither "CollectionsResponse" nor "ResultsOKResponse" defines the entryType. The question how to do this in json schema with no pain (not rewriting ALL components) I should think a little bit...

---edit--- The ResultsOKResponse should not return the collection. But it could include concrete entryType for items.

ResultsOKResponse:
    description: Successful operation.
    content:
        application/json:
            schema:
                $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/responses/beaconCollectionsResponse.json
jrambla commented 8 months ago

Let me disagree, a query to a collection could return a boolean or count response, telling he number of collections (datasets or cohorts in the Bv2 Model) that fulfill the query. The question could be: "how many cohorts do you have with men above 50 yo?"

mbaudis commented 8 months ago

@Jordi That is all fine and I agree that this could be a use case; and also that there is IMO a need for having a general response granularity and overall the boolean/count response options.

The response still we have to be fixed from beaconResultsetsResponse to beaconCollectionsResponse etc.

But then in the long run IMO response granularity should be per resultSet to allow especially different granularities in aggregators, but also for Beacons w/ multiple datasets. (AND collections responses could also be resultSets since we're having this construct - e.g. datasets have cohorts. Same for filteringTerms which are per dataset etc - we have to deal with this already!). Just not sure if this would make things more simple (only one response type with different granularities) or more complex ...

mbaudis commented 8 months ago

... so in the short run beaconResultsetsResponse => beaconCollectionsResponse should do the trick, right?

redmitry commented 8 months ago

a query to a collection could return a boolean or count response,

... I see no way for boolean response for "collections" in the beaconCollectionsResponse. The items in the "collections" are entryType(s) beaconCollectionsResponse.json

I also wonder, how could I know if the response is the beaconCollectionsResponse or beaconResultsetsResponse from the configuration/map ?

mbaudis commented 8 months ago

@redmitry @jrambla

... I see no way for boolean response for "collections" in the beaconCollectionsResponse. The items in the "collections" are entryType(s) beaconCollectionsResponse.json

That is true but if you drop the requirement on response like that:

required:
  - meta
  - responseSummary

... you can use boolean or count granularity.

I also wonder, how could I know if the response is the beaconCollectionsResponse or beaconResultsetsResponse from the configuration/map ?

Yes, not possible righty now; only defined in the OpenAPI endpoints.

We actually save such info min our configs:

individual:
  request_entity_path_id: individuals
  response_entity_id: individual
  collection: individuals
  response_schema: beaconResultsetsResponse
  beacon_schema:
    entity_type: individual
    schema: https://progenetix.org/services/schemas/individual/

cohort:
  request_entity_path_id: cohorts
  response_entity_id: cohort
  collection: collations
  response_schema: beaconCollectionsResponse
  beacon_schema:
    entity_type: cohort
    schema: https://progenetix.org/services/schemas/cohort/
  pagination:
    skip: 0
    limit: 10
mbaudis commented 8 months ago

TODO: Rewrite this into 2 issues:

  1. remove requirement for results response element to allow summary responses
  2. determine where/how to define the response schema for entry types (collections... or resultsets)
mbaudis commented 8 months ago

@jordi @redmitry @costero-e Well, I guess issue 1 is different from what I mentioned very excitedly in the discussion & my claim was probably influenced by my re-thinking of the general "how are response types" are being defined (since we'll need responses with mixed boolean/count/record ... granularity).

Here,

remove requirement for results response element to allow summary responses

... would result in a summary response (boolean or count). And the same is true for beaconResultsetsResponse ... etc.

But when using OpenAPI to retrieve the allowed responses the missing option of Boolean ... responses for datasets and cohorts had been fixed at some point already; so issue 1. disappears.

This leaves 2; the only place where response schemas are associated with entry types are the OpenAPI endpoints; there is no definition of associated response types in the model itself.

So for 1. we're back to a simple bug fix for https://github.com/ga4gh-beacon/beacon-v2/blob/75850c3f19816ee05d4fd67d39811f9c5a6a8f29/models/src/beacon-v2-default-model/cohorts/endpoints.yaml#L175 etc.; otherwise good. But having to look at the horribly complex (and therefore potential erroneous - see the link) for figuring out the response schema is ... not nice.

mbaudis commented 7 months ago

I'm retiring this, to be continued in #123. I'll also clean out the PR etc. Was helpful; for the discussions, though ¯_(ツ)_/¯