ireceptor-plus / issues

Transition from v1.3 to v1.4 of AIRR schema #101

Closed. schristley closed this issue 1 year ago

schristley commented 2 years ago

We need to come up with a timeline and process for upgrading to the new schema version.

bcorrie commented 2 years ago

Based on CRWG discussions, I think we probably want a /airr/v2 type of entry point, since the changes are not backwards compatible. I think it would be nice if the API entry point tracked the AIRR Standard version - is that crazy? 8-) Should we have /airr/v1.4?

We should check to make sure that we are following the AIRR Community approach on version releases. Because the proposed v1.4 is not a backwards-compatible release, I think we are allowed to bump the minor version - @bussec any thoughts on that?

Somewhat dated docs on this process are: https://github.com/airr-community/airr-standards/blob/master/CONTRIBUTING.rst

bcorrie commented 2 years ago

@schristley I think we are targeting June for the /airr/v1.4 release for our repositories. At that time we should have update scripts to transition iReceptor Turnkey repositories from the v1.3 spec to the v1.4 spec. I believe the main non-backwards-compatible changes are the ontology fields.

We will provide this to our users (sciReptor, NICD, UKM) but we don't really have a lot of control with them as to when they upgrade. We will have to reach out to them and discuss this.

The more I think about this, the more I think the biggest challenge is for the user/client of the repositories. I am thinking that the iReceptor Gateway might need to support two minor release versions at a time. If the current version is v1.4, we support v1.4 and v1.3, but not anything older. Similarly for a move to v1.5 we only support v1.5 and v1.4.

What do you think about that from a VDJServer perspective??? I haven't talked to @jeromejaglale about this yet 8-)

Definitely would mean some special case handling but hard to see anything else being practical in an ADC where you don't control the repositories! We still have some "control" since they are all our friends, but in general that would not be the case. If we don't do something like that, you would need to have everything switch at once (not practical) or have the Gateway only support one version and have the ADC "shrink" dramatically from a Gateway perspective when a switch occurs (not desirable).

schristley commented 2 years ago

I think it would be nice if the API entry point tracked the AIRR Standard version - is that crazy? 8-) Should we have /airr/v1.4?

Hmm, interesting idea, but my initial response is that I'm not a fan of this. Technically, the API doesn't have to have a new version or change just because the schema changes. The Info block currently lets the client know what schema version is provided. And it really doesn't solve the underlying issue that clients need to decide how to handle different schema versions; it just seems to add more endpoints that repositories have to manage. I suppose it does allow repositories to support multiple schema versions at the same time, but do repositories really want to do this?

I personally like the idea that the API versioning is strictly about the API itself, i.e., the end points and the parameters. If those don't change, then there is no need for the API version to change. But I suppose it's an alternative viewpoint that the API versioning is expansive and covers all data changes as well. I just don't see how that helps solve the main issue for clients. In my mind, the code if (api.version == "1.4") is essentially equivalent to GET /airr/v1.4
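A minimal sketch of that equivalence, under stated assumptions: whether the client reads the schema version from the Info block or from the entry-point path, it ends up branching on the same version string. The `{"schema": {"version": ...}}` layout of the Info response, the function names, and the `HANDLERS` table are all hypothetical and only serve the illustration.

```python
import re


def version_from_info(info):
    """Read the schema version a repository reports in its Info block.

    The {"schema": {"version": ...}} layout is an assumption for this sketch.
    """
    return info["schema"]["version"]


def version_from_path(path):
    """Read the version embedded in an entry point such as /airr/v1.4."""
    match = re.match(r"^/airr/v(\d+(?:\.\d+)*)", path)
    if match is None:
        raise ValueError(f"no version in entry point: {path!r}")
    return match.group(1)


def pick_handler(version, handlers):
    """Select the handler registered for a major.minor version."""
    key = ".".join(version.split(".")[:2])
    try:
        return handlers[key]
    except KeyError:
        raise ValueError(f"unsupported schema version: {version}") from None


# Hypothetical handlers; a real client would put schema-specific parsing here.
HANDLERS = {
    "1.3": lambda record: ("v1.3", record),
    "1.4": lambda record: ("v1.4", record),
}

if __name__ == "__main__":
    by_info = pick_handler(version_from_info({"schema": {"version": "1.4"}}), HANDLERS)
    by_path = pick_handler(version_from_path("/airr/v1.4/repertoire"), HANDLERS)
    # Both routes select the same handler object.
    print(by_info is by_path)
```

Either way the client still needs one handler per schema version; the versioned entry point only changes where the version string comes from.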

Now a true alternative would be that instead of requiring clients to handle different schema versions, clients tell the API (through a parameter) what schema version they want. This shifts the onus onto the repositories to be able to provide multiple schema versions, so in theory clients don't have to change. I say "in theory" though because I think practically repositories would have a hard time doing this.

schristley commented 2 years ago

We should check to make sure that we are following the AIRR Community approach on version releases. Because the proposed v1.4 is not a backwards-compatible release, I think we are allowed to bump the minor version - @bussec any thoughts on that?

Somewhat dated docs on this process are: https://github.com/airr-community/airr-standards/blob/master/CONTRIBUTING.rst

I don't think CRWG ever formally decided on a versioning scheme. What is documented above was specifically designed for the schema. I'm of the personal opinion that the API should be on its own versioning track, since that provides more control and flexibility. For reference, it avoids the "cleanup" we had to do for the v1.3 schema, in which we tried to make API v1.0 ready at exactly the same time. It didn't work out and we needed to "fix" github and regenerate tags...

bcorrie commented 2 years ago

Good point... So this is mostly an AIRR CRWG question - does the 1.4 schema require an API /airr/vX change?

We have new endpoints so that might indicate yes.

We have new fields in the spec, but the actual structure of the API query is the same. At the same time, we have some adc_ specific query parameters that are new that are outside the AIRR spec. So I think that would indicate yes as well???

schristley commented 2 years ago

What do you think about that from a VDJServer perspective??? I haven't talked to @jeromejaglale about this yet 8-)

@bcorrie From a repository perspective, supporting multiple schema versions feels like a nightmare to me. I think that essentially means I'd need two copies of the data, then I have to write all the code infrastructure to support updating both of those versions, that flows through the whole stack, ugh... The preference is clearly to do the conversion once and then only support the new version.

However, I do have separate production and staging databases, so I see doing the conversion on staging, which allows clients to test, but at that point there are no more updates to production or the old version. Once things are good, flip the switch to make staging become production. This is essentially the same process that I do today for new db releases.

schristley commented 2 years ago

Good point... So this is mostly an AIRR CRWG question - does the 1.4 schema require an API /airr/vX change?

We have new endpoints so that might indicate yes.

We have new fields in the spec, but the actual structure of the API query is the same. At the same time, we have some adc_ specific query parameters that are new that are outside the AIRR spec. So I think that would indicate yes as well???

Don't get me wrong. I think it's a valid viewpoint that if the data schema changes, that could be considered a fundamental change to the API. For example, this line:

        $ref: 'https://raw.githubusercontent.com/airr-community/airr-standards/v1.3/specs/airr-schema-openapi3.yaml#/Repertoire'

needs to be changed to this:

        $ref: 'https://raw.githubusercontent.com/airr-community/airr-standards/v1.4/specs/airr-schema-openapi3.yaml#/Repertoire'

so you can say, technically, the spec file has changed thus it requires a new version.

I think I'm also fine with that, it kinda makes sense that after a new schema version is released, a new "point release" (1.1, 1.2, etc) of the API is released (so that other tools can reference a specific git tag). I'm just not sure if that requires changing the end points...
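As a rough illustration of how small that spec change is, the sketch below builds the `$ref` URL from a schema git tag and retargets every reference in a spec file from one tag to another. The URL pieces come from the two lines quoted above; the function names `schema_ref` and `retarget_refs` are hypothetical and not part of any AIRR tooling.

```python
# Base URL and spec path taken from the $ref lines quoted in this comment.
BASE = "https://raw.githubusercontent.com/airr-community/airr-standards"
SPEC_PATH = "specs/airr-schema-openapi3.yaml"


def schema_ref(tag, definition="Repertoire"):
    """Build the $ref URL for one schema definition at a given git tag."""
    return f"{BASE}/{tag}/{SPEC_PATH}#/{definition}"


def retarget_refs(spec_text, old_tag, new_tag):
    """Point every airr-standards reference in spec_text at new_tag."""
    return spec_text.replace(f"{BASE}/{old_tag}/", f"{BASE}/{new_tag}/")


if __name__ == "__main__":
    old_line = f"        $ref: '{schema_ref('v1.3')}'"
    print(retarget_refs(old_line, "v1.3", "v1.4"))
```

A point release of the API spec could then amount to running something like `retarget_refs` over the spec file and tagging the result, without touching the end points.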

schristley commented 2 years ago

The more I think about this, the more I think the biggest challenge is for the user/client of the repositories. I am thinking that the iReceptor Gateway might need to support two minor release versions at a time. If the current version is v1.4, we support v1.4 and v1.3, but not anything older. Similarly for a move to v1.5 we only support v1.5 and v1.4.

I agree. I don't see how to avoid clients having to deal with schema changes. It's important to note that all the new AIRR objects are just experimental (with additional quirks like clones don't support single cell yet), so there are undoubtedly gonna be more changes, likely for years to come...

bcorrie commented 2 years ago

We discussed this, and from the iReceptor Gateway perspective, we intend to support two API versions at a time. It is hard to imagine avoiding that, and we think it is fairly manageable on the Gateway. At the same time, we would no longer support any API entry point that is two or more versions behind the current version (we would support at most one old version).

I think that is reasonable and gets us through transitions as the versions/standards change.
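The support window described here could be checked with something like the sketch below. It assumes versions are `major.minor` strings, that "one old version" means the immediately preceding minor release within the same major version, and that a repository newer than the Gateway's current version is not supported; `major_minor` and `is_supported` are hypothetical names.

```python
def major_minor(version):
    """Split a version such as '1.4' or 'v1.4' into (major, minor) integers."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def is_supported(repository_version, current_version):
    """True if the repository is on the current or the previous minor version.

    Assumption for this sketch: both versions share the same major number and
    repositories ahead of the current version are rejected.
    """
    repo_major, repo_minor = major_minor(repository_version)
    cur_major, cur_minor = major_minor(current_version)
    return repo_major == cur_major and cur_minor - repo_minor in (0, 1)


if __name__ == "__main__":
    for repo in ("1.2", "1.3", "1.4"):
        print(repo, is_supported(repo, "1.4"))
```

With the current version at 1.4 this accepts 1.4 and 1.3 and rejects 1.2; after a move to 1.5 it accepts 1.5 and 1.4 only, which matches the policy described earlier in the thread.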

schristley commented 2 years ago

I think that is reasonable and gets us through transitions as the versions/standards change.

Sounds good. I'll likely do the same with VDJServer. There's a bit more involved with VDJServer beyond the ADC, as I also have to transition the data entry screens, the tapis apps, and visualizations.

schristley commented 1 year ago

all done