Materials-Consortia / OPTIMADE

Specification of a common REST API for access to materials databases
https://optimade.org/specification
Creative Commons Attribution 4.0 International

Standardizing serving the API on the unversioned URL #277

Closed rartino closed 4 years ago

rartino commented 4 years ago

This was brought up at the end of the OPTIMADE 2020 workshop:

Right now the specification does not forbid, but also does not in any way endorse, serving the API on the unversioned base URL.

Do we want to standardize how the API is served on the unversioned base URL?

If we cannot agree, then the "default" is that we release as v1.0.0 the specification that neither forbids, nor endorses, serving something at the unversioned URL.

This question also ties into what the proper procedure for version negotiation between a client and server is. Presently the specification does prescribe the following procedure:

For implementers: Clients are recommended to discover the highest version supported by both the client and the API implementation by trying versioned base URLs in order of priority. E.g., if major version 2 and lower are supported by the client, it would try: /v2, /v1, and then /v0.
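
As an illustration of the discovery procedure quoted above, here is a minimal Python sketch. It is not taken from the specification: `candidate_base_urls`, `negotiate`, and the `probe` callback are hypothetical names, and `probe` stands in for whatever HTTP request a real client would issue against each versioned base URL.

```python
from typing import Callable, List, Optional


def candidate_base_urls(base_url: str, max_major: int) -> List[str]:
    """Versioned base URLs to try, highest major version first."""
    base = base_url.rstrip("/")
    return [f"{base}/v{major}" for major in range(max_major, -1, -1)]


def negotiate(base_url: str, max_major: int,
              probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first versioned base URL that `probe` accepts, else None."""
    for url in candidate_base_urls(base_url, max_major):
        if probe(url):
            return url
    return None
```

Passing the probe in as a callback keeps the ordering logic separate from the network layer, so it can be exercised without a live server.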

sauliusg commented 4 years ago

If you think I exaggerate the wordiness here compared to what I wrote for the versioned endpoint, feel free to show how you intend to formulate this in your own papers.

Well, just copy exactly the same text as for your versioned endpoints and you are done.

sauliusg commented 4 years ago

No, the worst case is that you get a slightly altered behavior that leaves you thinking that the query worked as intended, but with e.g., entries missing.

You may interject "but we will never change the behavior like that!". Sure, we can try to avoid it. But it is not always obvious how interconnected features interact when they are changed in backwards-breaking ways (e.g., the structure of fields), and something like this could slip through the cracks unintentionally. It will be your fight, every time we evolve the API to a new major version, to make sure to avoid any such subtle breakage for the examples published in your papers.

It will not be my fight since I am not going to make the promises you are implying.

Now, to cite REST discussions on the Net and R. Fielding, "The key abstraction of information in REST is a resource. Any information that can be named can be a resource:" (emphasis mine). So it seems that we have a resource, /structures, and we have a name (URI) for it https://example.com/optimade/structures. Version numbers are not in the game. In the (https://restfulapi.net/resource-naming) examples, no version numbers are included. This also corresponds to my experience and my expectations. Well, maybe the guys did not think that far ahead, but still https://example.com/optimade/structures seems to be a standard REST name, and I expect (probably everyone else does) it to be queryable.

We can include versioned endpoints as means to increase reliability in the future (they were criticised in one of the blogs you have cited, but the criticism was based on philosophy, so we can happily ignore it). But still unversioned endpoints seem to be primary names, and versioned ones are implementation-dependent synonyms.

With the /versions endpoint, we have a means to determine unambiguously what version of the API it serves. Thus it is possible to implement an automatic, reliable query client. The disclaimer that some QS may not work is there. IMHO, this is enough to let the unversioned endpoints go as a 'MAY' feature.

So:

  1. /optimade/versions is a MUST;
  2. /optimade/v1/info is a MUST iff /v1 is supported, otherwise it MAY return standard HTTP 404;
  3. /optimade/info is a MAY and if it exists, it MUST serve the top-most version listed on /optimade/versions;
  4. We understand the caveats of using both versioned and unversioned endpoints and document them as widely as possible;
  5. Algorithms of reliable querying OPTIMADE in the presence of arbitrarily incompatible future major changes exist, provided /versions stays stable; we document them as widely as possible.
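
To make item 3 concrete, here is a minimal Python sketch of how a client might read /versions and determine which version the unversioned endpoints are expected to serve. The body format is an assumption of this sketch (a `version` header line followed by one major version number per line, most preferred first), and `parse_versions` and `unversioned_version` are hypothetical helper names.

```python
from typing import List


def parse_versions(body: str) -> List[int]:
    """Parse an assumed /versions body: a 'version' header line, then one
    major version number per line, most preferred first."""
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    if not lines or lines[0] != "version":
        raise ValueError("expected a 'version' header line")
    return [int(line) for line in lines[1:]]


def unversioned_version(body: str) -> int:
    """Version the unversioned endpoints serve under item 3: the first listed."""
    versions = parse_versions(body)
    if not versions:
        raise ValueError("no versions listed")
    return versions[0]
```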

Other endpoints exist along the same lines. To me it seems reasonable.

Do you agree with this policy?

rartino commented 4 years ago

Yes, but this means that I need to bother about all previously existing versions (what you suggest is even worse: I need to bother about versions which I never implemented, if we want to make the behaviour consistent).

if (not requested_version in supported_versions) {
  return http_response(551, "Version not supported", "The requested version of the OPTIMADE API is not supported by this implementation. The requested version was "+requested_version+". The available versions are: "+supported_versions+". Please refer to https://www.optimade.org/ to find information about how to update your query to a more recent version.")
}

Done!

In the text – where? In the documentation?

Right in the text I proposed to put in a paper:

You can make this query using version 1 of the OPTIMADE API as: https://example.com/optimade/v1/structures?filter=elements HAS Na AND elements HAS Cl AND nelements=2 . If version 1 is no longer supported by the database, refer to https://optimade.org for up to date information on how to formulate the query in a more recent version.

What prevents you from putting a similar disclaimer next to unversioned endpoint specs, with exactly the same effect? ... Well, just copy exactly the same text as for your versioned endpoints and you are done.

So, to be absolutely clear, do I understand correctly that this is what you will put in your papers?:

You can make this query using version 1 of the OPTIMADE API as: https://example.com/optimade/structures?filter=elements HAS Na AND elements HAS Cl AND nelements=2 . If version 1 is no longer supported by the database, refer to https://optimade.org for up to date information on how to formulate the query in a more recent version.

I mean, to each their own. But, I would be confused by your paper if I was new to the OPTIMADE API and tried this query after you had upgraded your server to version 2 on that exact URL.

 I'd say this is a major advantage of having static examples point to versioned endpoints over trying to handle old requests on the unversioned endpoint, where you have no way to tell what version of the API a client is trying to interact with.

So add &api=v1 to the QS.

Or, even better, insert /v1/ in the URL, since that is what we have already standardized on to provide the functionality you seek.

It will be your fight, every time we evolve the API to a new major version, to make sure to avoid any such subtle breakage for the examples published in your papers.

This can only be solved with citable queries that have unique resource IDs, as mentioned above.

This goes a bit off-track in my opinion, but alright. I've read your text about reproducible queries. I don't understand the use for the UUID:ed query representations. We can discuss the various features that can be provided for old queries (results on old/new data returned in old/new representation) in a moment, but the first step is to agree on a stable representation of old queries. Why not just use this?:

v1/structures?filter=elements HAS Na AND elements HAS Cl AND nelements=2

This will always have a very precise meaning, as specified by the historic OPTIMADE API specification for version 1. If your implementation drops support for v1 you can still use this as a stable abstract string identifier for that historic query, and you can use it the same way as you propose to use UUIDs for cache and translations.

Now, let's think about the feature of providing access to old data for this old query. I would do it today as:

v1/structures?filter=elements HAS Na AND elements HAS Cl AND nelements=2&_exmpl_historic_timestamp=2020-06-17T14:30:20Z

This way I don't have to cache the old response if my database handles snapshots. If it does not, then I can still cache the response and use the above string representation to access that cache.

It requires a bit more thought how to provide a good interface for the functionality of requesting these structures in the latest response format. Perhaps like this?:

v1/structures?filter=elements HAS Na AND elements HAS Cl AND nelements=2&_exmpl_historic_timestamp=2020-06-17T14:30:20Z&response_format=_exmpl_json_v2
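
A minimal Python sketch of assembling such query strings with properly encoded parameters. The `_exmpl_historic_timestamp` parameter and the `_exmpl_json_v2` format are the database-specific examples from the comment above, not standardized names, and `historic_query_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def historic_query_url(base_url: str, filter_expr: str,
                       timestamp: Optional[str] = None,
                       response_format: Optional[str] = None) -> str:
    """Build a v1 structures query, optionally pinned to a data snapshot."""
    params = {"filter": filter_expr}
    if timestamp is not None:
        # Hypothetical database-specific parameter from the comment above.
        params["_exmpl_historic_timestamp"] = timestamp
    if response_format is not None:
        params["response_format"] = response_format
    return f"{base_url.rstrip('/')}/v1/structures?{urlencode(params)}"


url = historic_query_url(
    "https://example.com/optimade",
    'elements HAS "Na" AND elements HAS "Cl" AND nelements=2',
    timestamp="2020-06-17T14:30:20Z",
    response_format="_exmpl_json_v2",
)
```

Because the encoded string is deterministic for a given set of parameters, it can also serve as the cache key mentioned above.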

I don't quite see the issues here. v1/structures?filter=... is a perfectly fine static representation of a historic query for me.

sauliusg commented 4 years ago

What are you trying to achieve that isn't covered by the versioned URL?: https://example.com/optimade/v1/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2

No it is not.

That it looks more like a static resource?

Yes, and this is why it is bad. I have to either commit to serving /v1 indefinitely or accept the 'link rot'.

You still need to figure out what to do if the server doesn't serve the version requested.

Yes, but on an unversioned resource with the &api=v1 request server can at least attempt to help you get the understandable response, while the static /optimade/v1/ is simply gone.

What I mean is: &api=v1 is not an obligation of the server but rather a hint from the client about its preferences. The server SHOULD return a v1 response if it can, but if it cannot, it MAY return the closest matching API version (e.g., v2, even if v10 is the default).

I now realise that I am reinventing content negotiation. However, my main concern is that I want to increase the longevity of links with QS included; https://example.com/optimade/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2&api=1 has a better chance to be served and understood correctly than either
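
One way to read this "hint, not obligation" rule is sketched below in Python. The function name `resolve_version` and the tie-breaking choice (prefer the newer version when two are equally close) are assumptions of this sketch, not something agreed in the thread.

```python
from typing import List, Optional


def resolve_version(hint: Optional[str], supported: List[int]) -> int:
    """Pick the version to answer with; `supported` is most preferred first."""
    if not supported:
        raise ValueError("server supports no versions")
    if hint is None:
        return supported[0]  # no hint: serve the default version
    wanted = int(hint.lstrip("v"))
    if wanted in supported:
        return wanted
    # Closest supported major version; ties go to the newer one (an assumption).
    return min(supported, key=lambda v: (abs(v - wanted), -v))
```

With supported versions 10, 9 and 2, a hint of `v1` resolves to 2 even though 10 is the default, which matches the example in the comment.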

https://example.com/optimade/v1/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2

or

https://example.com/optimade/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2

rartino commented 4 years ago

[...] Do you agree with this policy?

This policy looks like what I propose to implement in #290. So, yes, I reluctantly agree to accept this policy.

sauliusg commented 4 years ago

[...] Do you agree with this policy?

This policy looks like what I propose to implement in #290. So, yes, I reluctantly agree to accept this policy.

OK. I also reluctantly accept it :P

sauliusg commented 4 years ago

But, I would be confused by your paper if I was new to the OPTIMADE API and tried this query after you had upgraded your server to version 2 on that exact URL.

Well, the spec says the endpoint serves the newest version which you can find in /versions and you are still confused? How do we make it more clear than this?

I am not a fan of making restrictions and forbidding people from doing certain things only on the theoretical assumption that they can do something wrong (or even less than that, something that you did not intend). Give OPTIMADE the Internet and Unix freedom (including the freedom to do wrong things)!

sauliusg commented 4 years ago

Yes, but this means that I need to bother about all previously existing versions (what you suggest is even worse: I need to bother about versions which I never implemented, if we want to make the behaviour consistent).

if (not requested_version in supported_versions) {
  return http_response(551, "Version not supported", "The requested version of the OPTIMADE API is not supported by this implementation. The requested version was "+requested_version+". The available versions are: "+supported_versions+". Please refer to https://www.optimade.org/ to find information about how to update your query to a more recent version.")
}

Done!

On the unversioned endpoint with &api=v1, yes. On the versioned endpoint, it is still 404 since nobody bothers to support it. Remember, my assumption is that /v1 is no longer supported by the server. That means nobody maintains redirects, URIs, etc. Versioned URIs are not cool. Unversioned URIs are :)

rartino commented 4 years ago

 I now realise that I am reinventing content negotiation. However, my main concern is that I want to increase longevity of links with QS included

You are arriving at the same conclusions as the link I listed earlier: https://www.troyhunt.com/your-api-versioning-is-wrong-which-is/ , which is that there are merits to all three version negotiation protocols. If we keep this discussion going, we'll eventually arrive at also wanting to add an option for clients to provide a version hint as a header.

If you really want to, I am OK with standardizing a &api=v1 query parameter. But all I see here is us moving around the version hint the client sends in the URL which, in my implementation, makes very little practical difference.

However, I'm not stoked about "the server gets to decide to return a different version than you ask for if it thinks that is ok", for precisely the same reasons we have discussed at length for requests with no version provided. I'd prefer suggesting (MAY or SHOULD) to serve an error of type: "Requested version X not supported. You can try to resubmit the same query to the closest supported available version Y by following this link or refer to https://www.optimade.org for documentation about the difference between the different versions of the API."

rartino commented 4 years ago

On the unversioned endpoint with &api=v1, yes. On the versioned endpoint, it is still 404 since nobody bothers to support it. Remember, my assumption is that /v1 is no longer supported by the server. That means nobody maintains redirects, URIs, etc. Versioned URIs are not cool. Unversioned URIs are :)

I think you missed the point with my pseudocode. It was meant to illustrate how one would implement returning a 551 Version not supported (not a 404 Not found!) with a user-friendly error message for all attempts at accessing any /v<integer> URL for a version that isn't supported. You don't need to "remember" to do anything specific for v1, all attempts at v<integer> where <integer> isn't recognized are handled exactly the same.
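
A runnable Python rendering of that idea is sketched below. It keeps the 551 status from the pseudocode in this thread; `check_version` and the path convention (paths relative to the base URL, such as `/v1/structures`) are assumptions of this sketch.

```python
import re
from typing import List, Optional, Tuple

# Matches /v<integer> at the start of a path, e.g. /v1 or /v12/structures.
VERSIONED_PATH = re.compile(r"^/v(\d+)(?:/|$)")


def check_version(path: str,
                  supported: List[int]) -> Optional[Tuple[int, str, str]]:
    """Return (status, title, detail) for an unsupported /v<integer> path,
    or None if the path is unversioned or the version is supported."""
    match = VERSIONED_PATH.match(path)
    if match is None:
        return None
    requested = int(match.group(1))
    if requested in supported:
        return None
    available = ", ".join(f"v{v}" for v in supported)
    detail = (
        f"The requested version was v{requested}. "
        f"The available versions are: {available}. "
        "Please refer to https://www.optimade.org/ to find information about "
        "how to update your query to a more recent version."
    )
    return 551, "Version not supported", detail
```

Nothing in this check is specific to v1: any `/v<integer>` prefix that is not in the supported list takes the same branch.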

rartino commented 4 years ago

But, I would be confused by your paper if I was new to the OPTIMADE API and tried this query after you had upgraded your server to version 2 on that exact URL.

Well, the spec says the endpoint serves the newest version which you can find in /versions and you are still confused? How do we make it more clear than this?

If your paper text says that the URL you provide allows me to query version 1 of the OPTIMADE API, while your server responds according to version 2 on that exact URL, I reserve the right to be confused. That is why I suggested a longer text was necessary.

rartino commented 4 years ago

You still need to figure out what to do if the server doesn't serve the version requested.

Yes, but on an unversioned resource with the &api=v1 request server can at least attempt to help you get the understandable response, while the static /optimade/v1/ is simply gone.

Alright, after thinking this over, I don't see this parameter causing any trouble if we make it completely optional, and name it something that expresses its weaker status, e.g., api_version_hint. And, we say that the parameter MAY be used for requests on unversioned endpoints to be interpreted at the discretion of the implementation to help the decision for how to interpret the request and in what version format to respond. The mandate to only serve the first version in /versions is lifted to only apply if no api_version_hint is provided.

Is this acceptable?

(It would be even better if we would forbid requests on unversioned endpoints without api_version_hint for anything else than single entry resource objects, but I guess you won't like that.)

sauliusg commented 4 years ago

You still need to figure out what to do if the server doesn't serve the version requested.

Yes, but on an unversioned resource with the &api=v1 request server can at least attempt to help you get the understandable response, while the static /optimade/v1/ is simply gone.

Alright, after thinking this over, I don't see this parameter causing any trouble if we make it completely optional, and name it something that expresses its weaker status, e.g., api_version_hint. And, we say that the parameter MAY be used for requests on unversioned endpoints to be interpreted at the discretion of the implementation to help the decision for how to interpret the request and in what version format to respond. The mandate to only serve the first version in /versions is lifted to only apply if no api_version_hint is provided.

Is this acceptable?

This looks OK to me, except that api_version_hint is quite long, api is shorter (and we know what it means anyway, don't we?).

But do we need this right now? Aren't we overloading OPTIMADE with secondary (possibly "nice-to-have", but unessential) features which we could live without?

On the REST in general

I'm also thinking on the issue, and in reading through https://restfulapi.net/resource-naming/ and several other blogs I am amused that nobody is even mentioning API versioning there!

So I wonder: what is the logic those people follow? I understand it like this:

  1. A REST endpoint, e.g. http://api.example.com/device-management/managed-devices/{id}, http://api.example.com/device-management/managed-devices, https://example.com/optimade/structures, https://example.com/optimade/structures/2200000 names a resource; what is returned is (supposed to be) content-negotiated and will return the latest resource in the latest API version representation;

I think we agree on this.

  2. If an endpoint (the resource) ceases to exist, HTTP 404 MAY be returned; the server can choose more informative codes, such as 301 Moved Permanently or 551 Version not supported or any other appropriate code; it seems that we agree on this as well.

  3. What about requests with QS? E.g. http://api.example.com/device-management/managed-devices?region=USA&brand=XYZ, or https://example.com/optimade/structures?filter=nelements=2

The naming discussion has a very strange (from our standpoint) comment on what an API is: "For this, do not create new APIs – rather enable sorting, filtering and pagination capabilities in resource collection API and pass the input parameters as query parameters." ??? It seems that the author of that blog did not consider QS as a part of the API at all! But regardless, what does he/she/we expect from the requests mentioned in item 3 when the API version changes and brand=... is not supported?

It seems to me that many people writing about REST tacitly assume that all unknown QS parameters are simply ignored (although the text just cited is silent on this). This is seriously against what we assume in OPTIMADE, where errors in parameters are reported as errors (in many cases under a 'MUST' clause).

What are the consequences of both of these choices? In the 'naive' REST that simply ignores QS parameters that it does not understand, a query:

https://example.com/optimade/structures?filter=nelements=2 is equivalent to https://example.com/optimade/structures if filter is no longer supported in the new version. This will return more data than requested. Also, the database version might have changed in between... But still the https://example.com/optimade/structures endpoint is valid and supported, even if 10 major API upgrades happened.

I agree that getting all structures instead of just selected structures is not admissible for scientific queries, where the selected subset is essential for reproducible computation.

So if 'filter=' element is no longer supported, we return an error. Bad luck for compatibility.
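
The difference between the two behaviours can be sketched in a few lines of Python. `apply_params` and its `strict` flag are hypothetical; they only contrast "ignore what you do not understand" with "report it as an error".

```python
from typing import Dict, Set


def apply_params(params: Dict[str, str], known: Set[str],
                 strict: bool) -> Dict[str, str]:
    """Return the parameters the server will act on.

    strict=False mimics 'naive' REST: unknown parameters are dropped silently.
    strict=True mimics the OPTIMADE stance: unknown parameters are an error.
    """
    unknown = sorted(set(params) - known)
    if unknown and strict:
        raise ValueError(f"unsupported query parameters: {', '.join(unknown)}")
    return {key: value for key, value in params.items() if key in known}
```

In the lenient mode, a request carrying `filter=nelements=2` against a server that no longer knows `filter` ends up with no parameters at all, which is the "returns more data than requested" case described above.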

But let's imagine in v10 filter is vastly expanded, but the filter=nelements=2 is still supported and means the same. The query works as before, just gives response in v10 format. Sounds OK? The link survives major API changes!

Another strategy would be:

  1. Don't promise anything regarding QS support in the future;

  2. When a query is done, we return as a part of response a https://example.com/optimade/query/{id} URI, or maybe even redirect to it, and this becomes a citable future-proof query.

But (2) is tough to implement, so not for the near future... In the near future we can only

  3. try to keep the filter=... strings stable.

I see (2) and (3) as the only possible alternatives, and (3) is way more simple.

(It would be even better if we would forbid requests on unversioned endpoints without api_version_hint for anything else than single entry resource objects, but I guess you won't like that.)

Yes, your guess is absolutely right, I would be quite unhappy with the 'MUST' on the version parameter :)

We need to have sensible defaults, and api_version_hint is not even the most important parameter!

rartino commented 4 years ago

I don't see this parameter causing any trouble if we make it completely optional, and name it something that expresses its weaker status, e.g., api_version_hint. And, we say that the parameter MAY be used for requests on unversioned endpoints to be interpreted at the discretion of the implementation to help the decision for how to interpret the request and in what version format to respond. The mandate to only serve the first version in /versions is lifted to only apply if no api_version_hint is provided. Is this acceptable?

This looks OK to me, except that api_version_hint is quite long, api is shorter (and we know what it means anyway, don't we?).

A query parameter for the version hint needs an underscore to adhere to JSON API v1.0. And, I wanted a name that highlights that this parameter is not as firm as these version selection parameters usually are. With api=v1 it would be reasonable to expect to either get v1 or an error. Perhaps api_hint=v1?

But do we need this right now? Aren't we overloading OPTIMADE with secondary (possibly "nice-to-have", but unessential) features which we could live without?

Well, if we are going to have a parameter like this, now is the time to define it.

As soon as v1 is out, and you publish a URL query example in a paper without such a parameter, i.e., as: https://example.com/optimade/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2 it is, in a sense, too late. After that point, you are going to argue that we cannot alter the meaning of that query in a future version, because that would "break" your paper.

Hence, if you say that you are going to use this parameter from day one if we define it in the specification, and thus your first publication will rather have: https://example.com/optimade/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2&api_hint=v1 then I would happily add the few extra sentences for defining api_hint in the spec.

However, if you rather say "meh, I probably won't bother for now", then it has no point. I won't use that parameter, because I am rather going to publish: https://example.com/optimade/v1/structures?filter=elements+HAS+"Na"+AND+elements+HAS+"Cl"+AND+nelements=2 because I am fully satisfied with how that handles all forward-compatibility issues. With the features allowed by #290, once I drop v1, users will be given a sensible error message that helps them adjust to a supported version. That is sufficient for me.

I've read your text on versioning in REST in general and agree up until this point:

But let's imagine in v10 filter is vastly expanded, but the filter=nelements=2 is still supported and means the same. The query works as before, just gives response in v10 format. Sounds OK? The link survives major API changes!

This sounds good in theory. But in practice, the promise that your interaction with the API stays stable is made firmly only within the minor releases, so that is where you are safe to make this assumption. It may look simple to commit to not breaking old filter= strings, but when the API evolves, we may find ourselves adding more functionality (selection of output entries, "join"-type queries, ...) and across all those features evolving over time, you cannot expect the precise behavior of an old QS to never change.

So, another model is to just make access to single resource objects stable over all versions of the API. URLs with type and id like: https://api.example.com/{type}/{id} give stable URLs to the resource objects. For all other interactions the client is required to give a version. If the server supports that version, great. If not, give an error message and be as helpful as possible in telling the user how to upgrade their QS.

We can discuss things as "translating" an old QS into a new one. But in practice, all that means is that you are retaining support for (at least part of) the older version. So, just serve queries under the old version if you can, and return errors if you can't. If that support of older versions is handled via an internal translator, that is just an implementation detail.

  1. When a query is done, we return as a part of response a https://example.com/optimade/query/{id} URI, or maybe even redirect to it, and this becomes a citable future-proof query.

Again, I don't see why this is necessary. Why can't that {id} just be the query itself (including the QS)? If the problem is "it may be too long", just use a hashing function.
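
A minimal Python sketch of the hashing suggestion, using SHA-256 from the standard library. `query_id` is a hypothetical helper, and the choice of SHA-256 is an assumption; the comment only says "a hashing function".

```python
import hashlib


def query_id(versioned_query: str) -> str:
    """Fixed-length identifier derived from a versioned query string,
    e.g. 'v1/structures?filter=nelements=2'."""
    return hashlib.sha256(versioned_query.encode("utf-8")).hexdigest()


short_id = query_id("v1/structures?filter=nelements=2")
```

The identifier is only as stable as the string that goes in, so the version prefix has to be part of the hashed input for the same filter under v1 and v2 to map to different IDs.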

sauliusg commented 4 years ago

  1. When a query is done, we return as a part of response a https://example.com/optimade/query/{id} URI, or maybe even redirect to it, and this becomes a citable future-proof query.

Again, I don't see why this is necessary. Why can't that {id} just be the query itself (including the QS)? If the problem is "it may be too long", just use a hashing function.

Well, the query itself (including the QS) is not enough because, as you have said also in your post, filter language may change in the future. If we agree that QS is not a part of the name then it should not be used to identify a resource.

sauliusg commented 4 years ago

A query parameter for the version hint needs an underscore to adhere to JSON API v1.0. And, I wanted a name that highlights that this parameter is not as firm as these version selection parameters usually are. With api=v1 it would be reasonable to expect to either get v1 or an error. Perhaps api_hint=v1?

But do we need this right now? Aren't we overloading OPTIMADE with secondary (possibly "nice-to-have", but unessential) features which we could live without?

Well, if we are going to have a parameter like this, now is the time to define it.

OK! I agree with all your rationale that we should introduce this element now if we are to introduce it at all. So, let's go for it. The string api_hint=v1 sounds like a good compromise between clarity and brevity.

Will you insert the text into the spec or should I do this? Same PR or another one?

sauliusg commented 4 years ago

So, another model is to just make access to single resource objects stable over all versions of the API. URLs with type and id like: https://api.example.com/{type}/{id} give stable URLs to the resource objects.

The utility and stability of these resource points is beyond doubt and I think we all agree on this.

The question is whether QS may/should be served on them as well? I argue that it should, because that is what most (all others?) REST interfaces usually do, and this would make system homogeneous and orthogonal.

Our main point of disagreement is "philosophical": I want to have the system as permissive and flexible as possible, even if it will occasionally leave an inattentive user baffled; you want the system as stable as possible, to make sure that inattentive users are protected from future mishaps, and you are ready to make it more restricted for the sake of this. Both approaches have their merit IMHO and both are good in different circumstances.

The price you are ready to pay is to refuse serving queries that are perfectly OK. For example, the https://example.com/optimade/v1/structures?filter=nelements=2 might be perfectly serviceable under https://example.com/optimade/v10/structures?filter=nelements=2 and thus under https://example.com/optimade/structures?filter=nelements=2 for the span of, say, 10 major versions, even though in general the filter language is incompatible. I personally do not want to get a "helpful" error message just in case my query is not guaranteed to work in general when I know that my query will work; I want to get an answer, and then I figure out myself what to do with it.

The price I am ready to pay is the occasional unexpected result when the unversioned endpoint API changes to an incompatible one. I would be happy with a warning and best-effort response, but you apparently want to fence users from such possibility, even if this forces them to re-do their work by adapting it to the new API version...

The current compromise I think addresses both needs:

a) to be as stable as possible, you use versioned endpoints a-la /v1 and make sure user gets informative message when such endpoint is not supported;

b) to be as flexible as possible, to access the interface regardless of the version, and to cite queries, we have optional unversioned endpoints. If you do not want them, you do not use them. I personally find them helpful to cite, e.g. in my slides, to make sure that the links are still valid some years after. With the suggested api_hint=v1 it can be made more stable. If you do not want the unversioned endpoint feature, you simply do not use it.

I hope we are settling on this compromise...

sauliusg commented 4 years ago

TL;DR

Further idle thoughts on future-proof development across major versions

So, another model is to just make access to single resource objects stable over all versions of the API. URLs with type and id like: https://api.example.com/{type}/{id} give stable URLs to the resource objects.

You will also have https://api.example.com/{type} to get all (paged) entries of that type, right?

But QS are not supported on these endpoints... Wouldn't it be confusing for all people who are used to and striving for orthogonal interfaces (i.e. if I am allowed to have QS on a versioned endpoint of the current version, the same QS should work on the corresponding unversioned endpoint)?

And we have no ways to document filters.

The problem, after some thought, seems to be split into two parts:

  1. How do we ensure compatibility of submitted parameters (QS filters, other QS params, POSTed data)?

  2. How do we ensure compatibility of returned data?

When we are talking about particular entity endpoints such as https://api.example.com/{type}/{id}, we seem to agree that a future major version will return the response in a future format; however, since the version will always be indicated in the /versions response as a simple number, you have formal means to automatically determine how to handle it (and browsers can use content negotiation). So this seems to be sort of fine, doesn't it?

The problem arises when the QS specification will change across the major versions – in that case the server may not even be able to return any sensible result at all. That's why you would like to exclude QS on unversioned endpoints, right?

But without QS, we have no means to specify version-proof queries. When I publish a https://example.com/optimade/v1/structures?filter=nelements=2 link, I am sure that sooner or later the link will go away simply because the /v1 will no longer be supported. In contrast, https://example.com/optimade/structures?filter=nelements=2&api_hint=v1 can stay even when API major versions progress to higher numbers.

We can discuss things as "translating" an old QS into a new one. But in practice, all that means is that you are retaining support for (at least part of) the older version.

No, that's the whole point of having the named queries!

When you stop supporting old filters, you translate old queries into new ones once, and serve only the new ones. E.g., for the sake of example, imagine v3 only supports YAML POST of the parse tree as a single query mechanism :) – you would then have a table that records old query (for documentation) and the new query which returns the same results. The new query may be a parse tree of the previous filter in the new supported format; but it can also be simply the list of identifiers returned by the previous query! Upon transition you check (once) that all translated queries return the same results as the old ones, set the OK flag and you are done. How to do this is up to the implementation, and can change over time, without changing the name https://example.com/optimade/query/{id}!
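
The translation table described here could look roughly like the following Python sketch. Everything in it is hypothetical (`NamedQuery`, `verify`, and the `run_old`/`run_new` callbacks that execute a query and return entry identifiers); it only illustrates the one-time check and the "OK flag".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class NamedQuery:
    old_query: str          # kept for documentation only
    new_query: str          # what the server actually runs after the transition
    verified: bool = False  # the "OK flag", set by the one-time check


def verify(table: Dict[str, NamedQuery],
           run_old: Callable[[str], List[str]],
           run_new: Callable[[str], List[str]]) -> None:
    """Mark each named query as verified if old and new return the same IDs."""
    for entry in table.values():
        old_ids = sorted(run_old(entry.old_query))
        new_ids = sorted(run_new(entry.new_query))
        entry.verified = old_ids == new_ids
```

The table is keyed by the `{id}` from https://example.com/optimade/query/{id}, so the public name stays fixed while `new_query` can be replaced at each transition.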

If we had such a mechanism, or a similar one, I would probably agree to restricting unversioned endpoints to requests without a QS (even then, it would be quite unorthogonal!), but named queries are tough to implement...

So, just serve queries under the old version if you can, and return errors if you can't. If that support of older versions is handled via an internal translator, that is just an implementation detail.

The translator will be gone together with the version /v1, so this is not an option.

  1. When a query is done, we return as a part of response a https://example.com/optimade/query/{id} URI, or maybe even redirect to it, and this becomes a citable future-proof query.

Again, I don't see why this is necessary. Why can't that {id} just be the query itself (including the QS)? If the problem is "it may be too long", just use a hashing function.

Outlined above.

sauliusg commented 4 years ago

As we have agreed elsewhere, we should document an optional api_hint=1 parameter. Do we include it into #290 or file a separate PR?

rartino commented 4 years ago

As we have agreed elsewhere, we should document an optional api_hint=1 parameter. Do we include it into #290 or file a separate PR?

It is in #290 now.

Our main point of disagreement is "philosophical": I want to have the system as permissive and flexible as possible, even if it will occasionally leave an inattentive user baffled; you want the system as stable as possible, to make sure that inattentive users are protected from future mishaps, and you are ready to make it more restricted for the sake of this. Both approaches have their merit IMHO and both are good in different circumstances. [...]
I hope we are settling on this compromise...

I agree, this perfectly sums up the situation. I think it is of major benefit to OPTIMADE if we can accommodate both these views. And it indeed seems that we are settling on a working compromise in #290.

You will also have https://api.example.com/{type} to get all (paged) entries of that type, right?

As per #290 right now, no. I was contemplating this, but I just don't see a use case for a version-agnostic permanent link to list all resource objects of a type in a database with no filtering capabilities. And the interface for paging may change, adding to the difficulty.

When a query is done, we return as a part of response a https://example.com/optimade/query/{id} URI, or maybe even redirect to it, and this becomes a citable future-proof query.

Again, I don't see why this is necessary. Why can't that {id} just be the query itself (including the QS)? If the problem is "it may be too long", just use a hashing function.

Well, the query itself (including the QS) is not enough because, as you have also said in your post, the filter language may change in the future. If we agree that the QS is not a part of the name, then it should not be used to identify a resource.

Sorry, I meant, the query including the version path segment. My point is that everywhere you intend to use these generated query ids, you could just as well use that string. That is a unique identifier, with a unique interpretation.

As long as you support that version, you can interpret the string to recreate the query. When you drop support for that version, it just becomes an abstract identifier for the query.
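As a sketch of this suggestion, the identifier can be derived from the full query string including the version path segment, with a hash used only to keep it short. The choice of SHA-256 and the names `query_id` and `known_queries` are assumptions for illustration; the thread only says "a hashing function".

```python
import hashlib


def query_id(versioned_query: str) -> str:
    """Derive a fixed-length identifier from a query string that includes the version."""
    return hashlib.sha256(versioned_query.encode("utf-8")).hexdigest()


# A hash cannot be reversed, so the server keeps the original string next to the id.
known_queries = {}

q_v1 = "/v1/structures?filter=nelements=2"
q_v2 = "/v2/structures?filter=nelements=2"

for q in (q_v1, q_v2):
    known_queries[query_id(q)] = q

# While /v1 is supported, the stored string can be re-run to recreate the query;
# afterwards the id is just an abstract name for it.
recovered = known_queries[query_id(q_v1)]
```

Because the version segment is part of the hashed string, the same filter under /v1 and /v2 gets two different identifiers, each with a single interpretation.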