RFC: Normalize embedded documents based on parent

BickelLukas commented 3 months ago

Summary

Given the following graphQl Schema

type Entity {
  id: ID!;
  fields(include: [String!]): [Field!]!
}

type Field {
  key: string;
  value: string;
}

the fields can be filtered by providing include which filters the returned fields by key.

Now we have a detail component that loads all fields without providing the include parameter and a Table component that only loads a subset of the fields.

When the Detail view now updates the Entity via a mutation, the fields on the detail view get updated, but the fields in the Table remain the same because the fieldKey contains the include parameter and the Fields themselves are not normalized but embedded per Entity.

The fields can not be globally normalized by key because they are different depending on which entity theyre on, but they should be normalized within the scope of a single entity.

Proposed Solution

Passing the parent to the keys function would allow construction of a composite key for a Field

keys: {
  Entity: (entity) => entity.id,
  Field: (field, parent) => entity.id + parent.key,
}

other solutions are also welcome

kitten commented 3 months ago

It's a little hard here to differentiate between what you're expecting, what you're trying to do, what your use-case is, and what you proposed as a solution, so maybe you pre-empted this issue description a little, since it combines a lot of what I mentioned at once. But I'll give it my best to describe what I think are some angles to a solution here that already work today ❤️

Re. parent-to-child relations. It's unclear to me what you mean with the disconnect between parent and child. Any field key is irrelevant here, because that's just a relation and the normalisation happens on entities (i.e. entity keys).

This means that, typically, if you have a parent-child relation (be it 1:N, or 1:1, etc), you can create edge objects that help with normalisation of fields on the relation itself. This is a really underappreciated pattern in schema design, and helps defend yourself against denormalised schemas that don't accurately reflect some UI dependencies and data.

For example,

type User {
  # ...
  likedBooks(someFilter: FilterInput): [Book]
}

type Book {
  # let's say this is unkeyable because of the following field:
  likedByUserAt: DateTime!
}

This often leads to issues compared to:

type User {
  # ...
  likedBooks(someFilter: FilterInput): [LikedBookEdge!]
}

type LikedBookEdge {
  book: Book!
  user: User!
  likedByUserAt: DateTime!
}

type Book {
  id: ID!
}

in this example, Book and User are completely independent from their directional edge-field likedByUserAt, so the UI and cache are able to differentiate, even when multiple users and a single book have multiple ways to connect to each other. (N:1 reverse case, basically)

But, it's hard to tell for me whether this is applicable to your situation. Specifically,

Now we have a detail component that loads all fields without providing the include parameter and a Table component that only loads a subset of the fields.

^this section doesn't tell me enough about why your schema and the normalised cache together aren't working out in your specific schema design case.

Passing the parent to the keys function would allow construction of a composite key for a Field

This is actually already possible. We recently have loosened some restrictions to create a blanket solution for these problems.

You can now pass the globalIDs config to Graphcache, which basically allows you to remove __typename from entity keys and hence lift restrictions that stop you from creating overlapping keys: https://commerce.nearform.com/open-source/urql/docs/api/graphcache/#cacheexchange

That said, I'd still recommend using Edge-type objects schema design, because:

{ Field: (field, parent) => entity.id + parent.key }

is a strong indication that globalIds aren't sufficient because your schema design necessitates treating objects as denormalised.

BickelLukas commented 3 months ago

@kitten Thanks for the quick reply. I understand your approach with edges but I'm not quite sure if that's applicable here.

Here's some more context on what we're building:
In our system, administrators can dynamically add fields to an entity. You can imagine it kind of like a CMS where the administrator designs the schema and the users can then fill in data.

Without graphql the api could just return the fields directly in the object:

{
  "id": 1,

  "title": "This is a title",
  "description": "detail text",
  ... dynamically added
}

but since graphql requires us to specify the exact schema of the data which is not known ahead of time we opted to return the fields as a list.

{
  "id": 1,
  "fields": [
    { "key": "title", "value": "This is a title" },
    { "key": "description", "value": "detail text" },
    ... dynamically added
  ]
}

so the fields collection is not an edge to another entity but rather just a way to expose dynamic data via graphql.

Now when we execute the following 2 queries, I would want the same value for the "title" field to be returned for both queries.

query CompleteEntity($id: ID!) {
  entity(id: $id) {
    id
    fields {
      key
      value
    }
  }
}

query PartialEntity($id: ID!) {
  entity(id: $id) {
    id
    fields(include: ["title"] {
      key
      value
    }
  }
}

You can now pass the globalIDs config to Graphcache, which basically allows you to remove __typename from entity keys and hence lift restrictions that stop you from creating overlapping keys: https://commerce.nearform.com/open-source/urql/docs/api/graphcache/#cacheexchange

We are actually already using globalIDs and our API follows the relay schema.

JoviDeCroock commented 3 months ago

Reading over this I wonder why that's not being returned as the JSON sacalar. That being said, this could just be an embedded entity by keying with () => null as arbitrary data like that shouldn't be subject to updates either way

kitten commented 3 months ago

Edges can still help here, because you can have entity -> fields(...) -> field edge -> field where field edge has a unique ID specific to both entity and field. Or alternatively, field has an ID field that both identifies the parent and the field key.

This by definition though is denormalised data otherwise. That's comparable to having a Field table in a database that doesn't have a field to identify the Entity.

The problem with customising embedded entity keys is that, instead of forcing good schema patterns, we'd make a case that's hard to configure and hard to customise, in that, this would have to be separate from keys and only trigger when a certain parent-child relationship occurs. Which would then likely cascade into feature requests for grandparent-parent-child relationships.

BickelLukas commented 3 months ago

It doesn't quite feel right, but I can solve it by adding the entityKey to the Field type like you suggested. Thanks!

urql-graphql / urql

RFC: Normalize embedded documents based on parent #3552

Summary

Proposed Solution