ga4gh-discovery / ga4gh-service-info

GA4GH service-info specification.
Apache License 2.0
9 stars 8 forks source link

Explain how to extend the specification by other specifications #35

Closed mcupak closed 5 years ago

mcupak commented 5 years ago

E.g. Beacon wants to have an image URL for organizations.

mcupak commented 5 years ago

The consensus seems to be to encourage a single top-level extension object for everything, incl. organizationImageUrl.

andrewyatz commented 5 years ago

Just tried this in refget as I've never done inheritance using OpenAPI before. Works really well!

    RefgetServiceInfo:
      allOf:
      - '$ref': https://raw.githubusercontent.com/ga4gh-discovery/ga4gh-service-info/develop/service-info.yaml#/components/schemas/ServiceInfo
      - type: object
        description: Extended capabilities of refget service info
        properties:
          extension:
            type: object
            required:
              - circular_supported
              - algorithms
            properties:
              circular_supported:
                type: boolean
                description: Indicates if the service supports circular location queries
              subsequence_limit:
                type: integer
                format: int64
                nullable: true
                description: Maximum length of sequence which may be requested using start and/or end query parameters
              algorithms:
                type: array
                items:
                  type: string
                  description: An array of strings listing the digest algorithms that are supported
                  example:
                  - md5
                  - trunc512

However the example response looks like

{
  "id": "org.ga4gh.service",
  "name": "1000 Genomes Project",
  "type": "org.ga4gh:beacon:1.0.1",
  "description": "The 1000 Genomes Project provides a large public catalogue of\nhuman genome variation data which serves as standard reference\nin many genomic workflows and analysis projects.\n",
  "documentationUrl": "https://docs.example.com",
  "organization": "EMBL-EBI",
  "contactUrl": "mailto:support@example.com",
  "version": "1.0.0",
  "extension": {
    "circular_supported": true,
    "subsequence_limit": 0,
    "algorithms": [
      [
        "md5",
        "trunc512"
      ]
    ]
  },
  "createdAt": "2019-06-04T12:58:19Z",
  "updatedAt": "2019-06-04T12:58:19Z"
}

We should make the OpenAPI more generic in the examples so it doesn't look this odd when we integrate it into 3rd party applications

andrewyatz commented 5 years ago

@rishidev I think we need an entry for service info's OpenAPI spec at https://github.com/ga4gh/w3id.org/blob/master/ga4gh/.htaccess to help this 3rd party integration

andrewyatz commented 5 years ago

See https://github.com/samtools/hts-specs/compare/master...andrewyatz:feature/serviceinfosupport as an example of this

mcupak commented 5 years ago

This is good, and works well in the service-registry too.

The problem we have is that we need to override the whole extension field rather than just add fields into it. This isn't just a theoretical issue. For example, we know for a fact that the Beacon specification needs to add custom fields, and Beacon implementers often add custom fields on top of the Beacon spec. As per #10, the Beacon spec needs to put everything in extension. The downstream spec for the implementation (or other dependent spec, such as the Beacon Network) can't just add more things, but needs to overrride the whole extension.

If we allowed arbitrary extending, rather than requiring to put everything in particular field, this would not be an issue. As a non-trivial bonus, this would also allow extension of nested fields, such as organization.

I think this justifies changing how we extend things, but I'd like to discuss on the call today.

There were 3 reasons we decided to enforce a particular field in the first place:

  1. Consistent versioning with respect to backward compatibility. While we can guarantee backward compatibility with both options at the level of the standard (if we're only bumping a minor version, we'd only add fields, not change existing fields), a minor version bump in the spec might force a major version update in an implementation (adding a field someone is already using in a different way).
  2. Easier to distinguish at first side what's a standard field vs. a proprietary extension.
  3. We know the tooling works well with this option.

We've previously explored 3 and found out this is a non-issue - the tooling works well with the alternative approach as well.

2 is subjective, and I'd argue not too important. From consumer's perspective, I don't really care if the field is standard, and if I want to know, I can always check the spec. The service API tells me the version of the spec it's implementing.

As for 1, the service now tells us its version, as well as the version of the spec. There's no reason why these should be versioned in sync, in fact, it's very likely you'll want to update the API of the service a lot more frequently than the spec.

I've looked for some similar situations elsewhere and found a few that support not going with a separate field. For example, HTTP headers have similar standardization and extension requirements. Originally, a decision was made to encourage headers prefixed with X-, but this is now an antipattern for very similar reasons we've debated here. JWT and Auth0 uses the same approach for claims (although they do have a registry for custom claims and namespace custom claims to avoid collisions).

mcupak commented 5 years ago

+1 from Andy

andrewyatz commented 5 years ago

Need to retire the extension field from the specification

andrewyatz commented 5 years ago

50 removes the extension field as agreed upon here

rishidev commented 5 years ago

@andrewyatz Feel free to make a PR on w3id.org if needed

andrewyatz commented 5 years ago

@rishidev this has been created. @mcupak take a look at https://github.com/ga4gh/w3id.org/pull/2. The url will be https://w3id.org/ga4gh/discovery/service-info-v1.yaml and currently points to what is on the develop branch (we can switch this later once through PRC).

mcupak commented 5 years ago

@andrewyatz A few questions:

andrewyatz commented 5 years ago

Quick responses

mcupak commented 5 years ago

Thanks @andrewyatz, makes sense. I filed https://github.com/ga4gh-discovery/ga4gh-service-registry/issues/88 for the registry.