hasura / graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
https://hasura.io
Apache License 2.0
31.18k stars 2.77k forks source link

Relay relationship ids #5035

Open ro-savage opened 4 years ago

ro-savage commented 4 years ago

Continues from Relay PR

The relay implementation of relationship ids makes it impossible to match two related objects that are queried separately.

In the database the ids look like Person {id: 1, role_id: 4} Role {id: 4}

From current hasura query result Person: {id: 'cGVvcGxlLTE=', role_id: 4} Role: {id: 'cm9sZXMtNA=='}

Now there is no way to match role_id: 4 to the role, because the ids are different.

Hasura should be returning role_id as the relay id instead Person: {id: 'cGVvcGxlLTE=', role_id: 'cm9sZXMtNA=='} Role: {id: 'cm9sZXMtNA=='}

The workaround right now is to run this query { person: { role: { id } } } instead of {person: { role_id } }. However the work around adds an extra unnecessary joins because the end user may already have the role data from a previous query.

While this may be small here, our real world use-case is much larger.

We have thousands of Assignments each is linked to a person, role and project.

We already have every person, role and project on the frontend from our initial relay query. But when I open a project I now want to view all the assignments.

Using our current relay setup, we just query like this assignment_query(project_id: $hash) { person_id, role_id, project_id }

However, with the way Hashura is set up because you can not query using the relationship ids, and because you can not see the relationship ids as relay ids we'd have to make a query that looks like

node_id(id: $project_id_hash) { person: { id }, role: { id }, project: { id } }

We've now gone from a database query that was simply SELECT id, person_id, role_id, project_id FROM assignments WHERE project_id = 1

to

SELECT assignments.id, people.id, roles.id, projects.id, 
FROM projects
JOIN assignments ON project.id = assignments.project_id
JOIN people ON assignments.person_id = people.id
JOIN roles ON assignments.role_id = role.id
WHERE project_id = 1

(Playing with this on production data, with a simpleq query that returns a large amount of data. Having to use role.id, project.id, person.id instead of role_id, person_id, project_id means the query returns about 20% slower)

We have a large project that has been using relay for years, and we have many many similar use cases in our project. Because we deal with large datasets, which we display in many different ways and are looked at through different lens. We tend of have lots of data already cache, and we always check out local cache first for the data by checking looking at the relationship_id and then using it to search our relay cache.

This is likely scenario that any large application using relay encounters many times.

In all other relay server frameworks I have used that interact with SQ, they always translate between the relayid and the database id automatically. So that on the frontend you only have to ever worry about one id (the relay id)

ro-savage commented 4 years ago

Example with subscriptions

When using subscriptions, I get will get my assignment returned with the database relationship ids{id: cm9sZXMtNA==', role_id: 3, project_id: 1, person_id: 6} however to update the relay store and ensure my assignments appears under project.assignments, role.assignments and person.assignments, I need to know the relayID for for each.

ro-savage commented 4 years ago

Example with mutations.

I create a new assignment via mutation and I need to give it the ids of the role, project and person.

However, hasura only accepts the database ID not the relay id. So instead of being able to do something like {role_id: role.id, person_id: person.id, project_id: project.id} I need to base64 decode the id of each object, to get the database id.

{ 
role_id: JSON.parse(atob(role.id)).columns.id, 
person_id: JSON.parse(atob(person.id)).columns.id, 
project_id: JSON.parse(atob(project.id)).columns.id
}
ro-savage commented 4 years ago

in response to the comment by @rakeshkky here

I think there's no point in having Role table information in Person table's column (role_id). Let's say if we are querying just the data from Person, having opaque base64 encoded value for role_id doesn't make it clear. Also, it is always assumed that role_id should match with Role's id.

I don't follow the logic here. An id is a way to identify values and objects. It doesn't matter how that id is derived, just that the id is always the same.

The fact that the id is a base64 encoded of the postgres path isn't relevant. Only that id === id.

In this case role_id = 2 and role.id = base64Hash. Means these ids are no longer equal, therefore they are no longer identifiers.

Futhermore, if you just have role_id = 2. You can not create the hash because you are missing information like the schema + table name (and have to know how hasura has chosen to represent the values).

I don't understand why you would ever want the same ids to be shown as two different values and what purpose that would serve?

tsnobip commented 4 years ago

I know it's a pain to do that for all your tables, but you could rename the ID column or use a different graphql name than id, then you could fetch both relay and db IDs.

ro-savage commented 4 years ago

@tsnobip - Thanks for the suggestion. However, what important here is the id that relay uses. To find things in the relay store, you need the relay id.

When you have person: {id: relayid, role_id: 3} and you need to search the relay store for the corresponding role, you are required to have the relay id.

Having the role as role: { id: relayid, dbId: 3 } doesn't actually help with that situations.

(There are however, other situations where having the db id is useful, and making graphql only name is a good way to get that additional information)

tsnobip commented 4 years ago

oh yeah I know what you mean, you'd indeed need to fetch { person: { role: { id } } } I guess, I'm very new to Relay so I'm trying to figure things out myself!

ro-savage commented 4 years ago

@rakeshkky @coco98

Is there any feedback on this and #5036 ?

I understand if you decide this is not the way you wish to handle Hasura with Relay. But if that is the case, it would be great to hear that and close the ticket.

Especially for our company, as we have been holding off work on our relay backend, so we could migrate to Hasura once relay support is released. However, if it's decided that Hasura will not support this, that means we can not use Hasura and will get back to work on our custom graphql backend instead.

Thanks, I appreciate all the great work the team has put in on Hasura.

0x777 commented 4 years ago

Hi @ro-savage, apologies for not responding to this earlier. This is great feedback! Thanks for taking the time to write this all down. Is this a fair breakdown of what's missing currently?

  1. Fetching a relationship's ID without a join.
  2. Using IDs to query data.
  3. Inserting/Updating data using IDs.
ro-savage commented 4 years ago

Correct.

The basic id is that Hasura in the translation layer for ids between the Database and the relay id.

React/Relay -> All ids are the Relay UUID. Hashura -> translates between Relay UUID and Database tables + ids Postgres -> Uses postgres table/id

I would expect that from my front-end with relay I should be able to:

  1. Query another a related object using the id (e.g. assignment.person_id should be the id I can query hasura/relay with)
  2. Match cached objects by id (e.g. if I already have all the people cached, I should be able to match that as person.id === assignment.person_id)
  3. Insert/update based on ids (e.g. I should be able to update the assignment <-> person relationship. such as assignment.person_id = person.id)

Here is an example of query times when you need to query assignment.person.id vs assignment.person_id. (I of course, am unable to test how long it takes for hashura to do the id -> relay id translation)

image
0x777 commented 4 years ago

The basic id is that Hasura in the translation layer for ids between the Database and the relay id.

Fair enough. However these changes are a little invasive and need a little bit more thought than we can afford to for the current release (1.3) so we plan on tackling these in the 1.4 release.

mrjbj commented 4 years ago

Thanks for this thread. I'm trying to use hasura-relay 1.3.0-beta.4 and am completely stymied in my attempt to write a react combo-box that updates an integer FK value to corresponding PK from related table. I think the issues described here apply to this use case also, though perhaps ro-savage has at least come-up with a way to work around, perhaps. @ro-savage, do you have a gist or code sample you could share that shows how to do update mutations on FK ids? (Ideally, I could use the UPDATE_tableName_BY_PK resolvers generated by hasura for this, but I can't figure out how to make it work.

ro-savage commented 4 years ago

@mrjbj - We've chosen not to implement Hasura until we can retrieve relay ids, as there would be a huge amount of re-work for us. So I am not sure the best way to implement in practice.

In theory you can query like this to get both the relay and db id. So you can use whatever you need at the time. {id: 'agzb7a1', person_id: 1, person: { id: 'nhzb7a1'}}

However, it's really messy to be keeping track of two separate ids and changing between them depending of you are doing a mutation, a where query, or searching the relay local store, or querying directly on an object.

ro-savage commented 4 years ago

@0x777 - Just wanted to see if there is any update on this.

We would love to move over to Hasura, but the current relay implementation makes this impossible for us (and others).

As always, happy to test out and give feedback on any PRs

mrjbj commented 4 years ago

Here's a mini-article I wrote to solidify my learnings on this topic after having had to bang my head against the wall trying to understand all of this. The root confusion in my case is that I had named the PK of my tables "id" using the pull down from Hasura's "frequently used columns" helper from the "add a new table" screen. One of the "frequently used columns" is "id" as primary key identifier and if you're not sure of the ramifications of doing this with the relay endpoint, you can run into confusion. Hope this helps someone else.

HasuraRelayId.pdf

0x777 commented 4 years ago

We'll pick this up immediately after 1.4 release (this was originally slated for the 1.4 release but we couldn't get around to this in time).

buzinas commented 3 years ago

Any updates on this @0x777?