Propose renaming id_v2 to something more permanent, and change type

magento / architecture

A place where Magento architectural discussions happen

275 stars 153 forks source link

Propose renaming id_v2 to something more permanent, and change type #396

Closed DrewML closed 4 years ago

DrewML commented 4 years ago

This (if approved) impacts an in-flight PR: https://github.com/magento/magento2/pull/28210

These changes are based on discussions I've had with @prabhuram93 and @nrkapoor.

There were a few concerns I wanted to address:

We should be using the ID type for IDs, because we don't want clients using them for anything but lookups. Mentioned in a separate design document
The document mentioned that id_v2 was a temporary field until we start using UUIDs. If we use the ID type, though, we shouldn't need a temporary field. If a client deploys a new version of the GraphQL API (after we switch from base64 >> uuid) and a client has cache entries for the old IDs, those old entries just won't be looked up in the client-side cache anymore. No loss/corruption should happen

I believe this matches the vision for every entity to have a globally unique ID, but with less downstream changes for clients. Please correct me if I'm wrong!
@nrkapoor and I discussed various names for the field. I've gone with uid in the PR, but happy to change if we can reach consensus on a different name. Here's the names we discussed:
- uuid: Bad because UUID is a specific encoding algorithm, and a client should not care about that
- gid: "Global Identifier"
- uid: "Unique Identifier"
- _id: Just kind of ugly
- identifier: Verbose, but accurate

Note: It's worth knowing that popular GraphQL client libraries (Apollo/urql) have default cache configurations that look for the field name id or _id on Object Types. Whatever new field we choose, we should be consistent going forward, as each headless solution will now need to customize their client to use this new cache key. See https://www.apollographql.com/docs/react/caching/cache-configuration/#custom-identifiers

paliarush commented 4 years ago

We will have two naming approaches in the system for some time.

Since the proposed name uid is not giving any substantial benefits, like being the default cache key for client side libraries, lets look at two pairs of names in the system:

uid and id
id_v2 and id

It looks like the question boils down to which pair is less confusing.

In my personal opinion the second option is less confusing because it follows the same versioning principle we already accepted for all fields. It is also intuitive.

If I was a developer seeing uid in some places, id in other places for the first time, it would raise question "why".

DrewML commented 4 years ago

@paliarush Thank you for the feedback!

It looks like the question boils down to which pair is less confusing.

IMHO, there is one additional consideration here: how this field looks in the schema in 5 years? I would hope that 5 years from now, we have one, well-known field to represent the ID of an object type (rather than id_v3, id_v4). Adding a version to the ID field feels like we're implicitly agreeing to a versioning scheme for this field, and I'm not convinced that's desirable.

Something like uid/guid/identifier/etc is not versioned, and ships with the message "this is now the ID field moving forward, so look for this field"
Something like id_v2 says "this is the ID field today, but it's likely to change in the future, so keep an eye out for id_v3".

Unlike other fields that are specific to their domain objects, IDs are a very general API feature. I'm of the opinion that this should be the last time we change this field name. Is there a scenario where we ever see ourselves having an id_v3 (Remember that ID type protects us from breaking changes due to format changes)?

It is also intuitive.

I would (respectfully) argue that uid is a pretty standard abbreviation for unique identifier , so I'd imagine a good chunk of developers would at least know uid == "some form of ID." At that point, they'll have the same question as id_v2: "why is there a new ID field now?" I think that problem is unavoidable.

paliarush commented 4 years ago

From the history of Magento we can assume that deprecated fields may stay there for more than 5 years.

I am not saying that the name uid is not intuitive by itself. I am just saying that when developer sees id in some objects, uid in other objects - it may raise more questions than seeing id and id_v2. This statement is subjective though.

Additionally, there is a chance of id_v3 in the future due to some semantical changes, which may affect clients using id not only for caching purposes, but also in business logic. In this case I would also prefer id_v3 instead of coming up with yet another alternative naming for the same field.

One solution could be to rename all ids in the system to uid and deprecate the rest. This will bring consistency and keep the name without the suffix. Can the client have a generic override for cache key saying that all objects will have uid and not id?

DrewML commented 4 years ago

From the history of Magento we can assume that deprecated fields may stay there for more than 5 years.

Totally separate (likely hours long) discussion, but sounds like we need some more general guidelines for how we handle deprecations (remove after X versions, X years, X decades, etc). But I know better than to try and address that in this ticket!.

I am not saying that the name uid is not intuitive by itself. I am just saying that when developer sees id in some objects, uid in other objects - it may raise more questions than seeing id and id_v2. This statement is subjective though.

I think it's safe to say that we both agree that using almost any name besides id, while also still having the old id fields, is confusing, which is good that we agree and bad that it's confusing 😆

Additionally, there is a chance of id_v3 in the future due to some semantical changes, which may affect clients using id not only for caching purposes, but also in business logic. In this case I would also prefer id_v3 instead of coming up with yet another alternative naming for the same field.

This is the part that I'm skeptical about, and really the driving motivation behind this PR. Can you give some examples of semantic changes that would lead to us introduce another ID field on an object? I think this is an important discussion to have.

Can the client have a generic override for cache key saying that all objects will have uid and not id?

Yep! Typically just passing in a callback with the signature (object: ObjectType) => string | string[]. That's why I think it's important for us to get 1 stable ID for caching and commit to it. We know a bit of pressure on the API can lead to service degradation, so having consistent caching for the client benefits both shoppers and DevOps.

paliarush commented 4 years ago

Let's have a little broader discussion to rename all IDs to uid then and make sure we introduce uid to every entity (not every object though). Looks like we both are okay with this option.

DrewML commented 4 years ago

Let's have a little broader discussion to rename all IDs to uid then and make sure we introduce uid to every entity (not every object though). Looks like we both are okay with this option.

Sounds good to me!

nrkapoor commented 4 years ago

Adding @mhaack for visibility. @mhaack please provide feedback.

DrewML commented 4 years ago

Additional reading that's helpful:

Our API currently doesn't support the ability to lookup an arbitrary entity by ID, which is desirable for some GraphQL clients. A consistent ID/type will open up the option to support this in the future.

tjwiebell commented 4 years ago

@DrewML - I won't be able to make the discussion, but had an additional point I wanted to make sure was represented.

On the same theme as automatic caching using id fields, another common setup for Apollo clients is to include their eslint plugin into your app to help catch common mistakes (eg. id is required if available, no deprecated fields). As you might see with the two rule examples I provided, moving forward with an id field that is not useful to the app and is flagged as deprecated means anyone using our API is going to have to:

Manually setup Apollo caching so it uses the correct ID field (eg. uuid).
Everywhere that id is not the actual ID, in each query or mutation response of that type they'll need to eslint-disable // eslint-enable that whole operation to suppress one of those two eslint rules.

I don't know if there was still an argument for just allowing backwards incompatible changes to id fields, but wanted to provide some additional issues developers are going to run into if we proceed with the new field approach.

As a moonshot alternative, if we're going to introduce a new ID field, I propose we do it across all types and deprecate id everywhere. At least that way you could write cache maps and eslint rules to always expect uuid, instead of having to manually determine for each operation which id field is actually correct.

DrewML commented 4 years ago

I don't know if there was still an argument for just allowing backwards incompatible changes to id fields

I think everything is on the table. IMHO this is the best idea if the only motivation is consistency. But, I suspect we might not be able to reach consensus on a breaking change of that size.

As a moonshot alternative, if we're going to introduce a new ID field, I propose we do it across all types and deprecate id everywhere. At least that way you could write cache maps and eslint rules to always expect uuid, instead of having to manually determine for each operation which id field is actually correct.

That's definitely what I'm advocating for 👍 1 well-known ID field, for every entity.

DrewML commented 4 years ago

Meeting scheduled today for us all to discuss. Here's a quick summary we can review when we kick off the meeting:

Summary of problem so far: This change is motivated by 2 tangentially related issues:

Magento Architecture has a long-term vision to switch from DB-incremented IDs to UUIDs for all entities.
UI developers need guarantees for cache keys to create a stable client-side cache, lessening server load

Here is what has been discussed in this PR so far:

Many entities in the schema, addressable by ID, have a field named id with an Int type. These types are not opaque, and have locked us into being unable to switch to something like UUIDs
For client-side caching, UI developers would like 1, well-known key (example: uid) that can always reliably be used as a cache key (mentioned by @tjwiebell in some other issues/Slack discussions)
- If we can make this ID unique across all entities (like a UUID), we unlock other possible features like a general Query.node(id: ID) for granular cache updates
We cannot change existing fields named id without a one-time breaking change
- To work-around this, the proposal is to choose a new, well-known field name, and add to all existing and new entities moving forward
- A breaking change is not strictly excluded from our options, but unlikely to reach consensus on this.
@paliarush has some concerns about what happens if we end up having to make another ID-related change down the line, giving us yet another field name.

DrewML commented 4 years ago

We had a productive meeting, with some outcomes!

Attendees:

@paliarush
@nrkapoor
@danielrenaud
@prabhuram93
@cpartica

Outcomes:

We agreed that we do not want versioned ID fields (id_v{num})
We agreed to go with the field name uid as the unique identifier for object types that represent entities
We agreed that new entity types moving forward will get a uid field, and no id field
We agreed that existing id fields should be deprecated, and those types should have a uid field added with type ID
- We'll need to update mutations that accept these id values before we can deprecate the id fields themselves (some risk here)

Action Items:

I will look into the risks and path forward for updating existing mutations to work with the output of the uid field with an ID type
@nrkapoor will create a story to encapsulate the id deprecations and uid additions
@prabhuram93 will continue to work with folks on existing PRs around options/ids to help guide them towards uid and ID

DrewML commented 4 years ago

Regarding these items:

We'll need to update mutations that accept these id values before we can deprecate the id fields themselves (some risk here)

I will look into the risks and path forward for updating existing mutations to work with the output of the uid field with an ID type

Update: Definitely a breaking change, per feedback given in https://github.com/graphql/graphql-js/pull/2711.

query GetSomeEntity($var: String!) {
   field1(id: $var) # If "id" is an ID type, this operation would break because of the variable
}

Will think about alternative paths - most obvious one is deprecating the existing queries/mutations that take a String or Int and replace them with a similar query/mutation that takes a proper ID. Will collect a list and look at what the impact would be.

Original Comment Details (Incorrect)

I made a quick POC GraphQL server using the [GraphQL reference implementation](https://github.com/graphql/graphql-js), and tested changing a field argument from `String` or `Int` to `ID`. I ran a sample query both before and after the change, and verified the query still works. ## Test Schemas ### Before ```graphql type Query { field1(arg1: String): Int field2(arg1: Int): Int } ``` ### After ```graphql type Query { field1(arg1: ID): Int field2(arg1: ID): Int } ``` ## Test Query ```graphql # This query works with both the "before" and "after" schema { field1(arg1: "foo") field2(arg1: 1) } ``` This works because of the [Input Coercion rules for `ID` types](http://spec.graphql.org/June2018/#sec-ID) in GraphQL: > Input Coercion > > When expected as an input type, any string (such as "4") or integer (such as 4) input value should be coerced to ID as appropriate for the ID formats a given GraphQL server expects. Any other input value, including float input values (such as 4.0), must raise a query error indicating an incorrect type. http://spec.graphql.org/June2018/#sec-ID ## Breaking Change Detection The GraphQL reference implementation includes a tool to find breaking changes in a schema, and it currently flags this as a breaking change. To make sure I'm not overlooking anything, I've started up a discussion on that repo: https://github.com/graphql/graphql-js/pull/2711