facebook / relay

Relay is a JavaScript framework for building data-driven React applications.
https://relay.dev

Reactive GraphQL Architecture #4687

Open captbaritone opened 7 months ago

captbaritone commented 7 months ago

Reactive GraphQL Architecture

This document outlines a vision for using GraphQL to model client data in applications which have highly complex client state. It is informed by the constraints of developing applications for the web, but should be applicable to native applications as well.

GraphQL provides a declarative syntax for application code to specify its data dependencies. While GraphQL was designed to facilitate query/response communication between clients and servers, it has also proved a useful mechanism for implementing client-side data loading from non-GraphQL servers. Implementing your client-side data layer as a GraphQL executor decouples product code from the code which fetches data from a REST server. The GraphQL resolver architecture also provides an opinionated way to model the data layer, forcing it to be implemented in a composable fashion, where the GraphQL executor is responsible for composing the individual resolvers together to derive all the data needed for a product surface.
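As a concrete (purely illustrative) example of this shape, the sketch below wires a hand-written resolver over a REST endpoint and lets graphql-js compose the query; the schema, the `/api/todos/:id` route, and the field names are assumptions, not part of this proposal.

```js
// Illustrative sketch only: a client-side GraphQL executor whose resolvers wrap
// a REST API. The schema, REST route, and field names below are assumptions.
import { graphql, buildSchema } from 'graphql';

const schema = buildSchema(`
  type Todo {
    id: ID!
    text: String
    completed: Boolean
  }
  type Query {
    todo(id: ID!): Todo
  }
`);

// Each root field is a small resolver composing a REST call; the executor is
// responsible for combining resolvers to satisfy whatever query product code writes.
const rootValue = {
  todo: ({ id }) => fetch(`/api/todos/${id}`).then((res) => res.json()),
};

// Product code only sees GraphQL; the REST details stay inside the resolvers.
graphql({ schema, source: '{ todo(id: "1") { text completed } }', rootValue })
  .then((result) => console.log(result.data));
```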

Historically, this architecture is most often encountered in products where the front-end team sees value in the developer experience of GraphQL, but organizational or technical impediments prevent implementing the GraphQL executor on the server. However, we are starting to see other types of applications where this architecture makes sense for purely technical reasons. Examples include:

While implementing a GraphQL executor on the client can be an attractive architecture from a developer experience perspective, it creates a number of challenges in terms of efficiency. The rest of this post will describe a proposed evolution of this architecture which preserves its benefits while mitigating many of its challenges.

The Architecture

While the architecture is not prescriptive about any actual tools, Relay, with its compiler and generated code, is well positioned to explore this architecture. Ideally it can eventually be decomposed into distinct tools:

Problems Solved

Benefits Preserved

Open Questions

We are still early in exploring this architecture and some open questions remain:

Collaboration Opportunities

Sources

These ideas have been explored across various projects:

alloy commented 7 months ago

What should the semantics of mutations be in the context of a reactive GraphQL executor? Specifically, the response portion of the mutation is generally used to specify which updates the client would like to observe, but with a reactive executor we already expect to be notified of changes to any data we are currently observing.

In a world where we'd want to ideally eliminate all updaters, optimistic updates will need to be handled by the local data-store layer. Would this become entirely a concern of the application, or do you imagine Relay would still play a role in this?

alloy commented 7 months ago

Are there viable migration strategies to incrementally adopt this architecture when coming from other existing setups?

This includes:

flow-danny commented 6 months ago

I was actually thinking about doing a little experiment implementing a Network layer fetchQuery that doesn't really fetch over HTTP, but gets data from local SQLite.

It would do this by running GraphQL resolvers as if it were a GraphQL server, avoiding all intermediate serialization.

A crude way to make it reactive could be to invalidate the Store after every commit from network updates?
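For illustration, a rough sketch of that experiment using Relay's `Network.create` and graphql-js execution; the `localGraphQL` module (a schema whose resolvers read from SQLite, plus the `db` handle) is hypothetical.

```js
import { Environment, Network, RecordSource, Store } from 'relay-runtime';
import { graphql } from 'graphql';
// Hypothetical module: a schema whose resolvers read from SQLite, plus the db handle.
import { localSchema, db } from './localGraphQL';

// A Relay network "fetch" function that executes the query locally instead of over HTTP.
async function fetchLocally(request, variables) {
  return graphql({
    schema: localSchema,
    source: request.text,       // the query text Relay hands to the network layer
    variableValues: variables,
    contextValue: { db },       // make the SQLite handle available to every resolver
  });
}

export const environment = new Environment({
  network: Network.create(fetchLocally),
  store: new Store(new RecordSource()),
});
```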

captbaritone commented 6 months ago

In a world where we'd want to ideally eliminate all updaters, optimistic updates will need to be handled by the local data-store layer. Would this become entirely a concern of the application, or do you imagine Relay would still play a role in this?

Still an open question! Perhaps the answer is that you'll be able to do either? Some client data layers may have their own optimistic state mechanism. That would probably have the ability to be more robust. A higher ceiling.

Conversely, some will not, in which case Relay's primitives, which make sense for server data, should still be available.

captbaritone commented 6 months ago

I was actually thinking about doing a little experiment implementing a Network layer fetchQuery that doesn't really fetch over HTTP, but gets data from local SQLite.

I did a prototype of something similar to this using Relay Resolvers which you can find here: https://relay.dev/docs/next/guides/relay-resolvers/introduction/

By using Relay Live Resolvers (an experimental feature) you can invalidate values at field- or record-level granularity. For a crude start I just invalidated every value on every db update. But something like https://github.com/vlcn-io/cr-sqlite could probably get you something much more sophisticated.
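For reference, a live resolver returns a `read`/`subscribe` pair; a minimal sketch, with a hypothetical `db` module standing in for the SQLite layer and the crude strategy of notifying on every write, looks roughly like this:

```js
// Minimal sketch of a live resolver backed by a local database.
// `db.getTodoCount()` and `db.onChange()` are hypothetical; the part that
// matters is the { read, subscribe } object the resolver returns.
import { db } from './localDb'; // hypothetical module wrapping SQLite + change events

/**
 * @RelayResolver Query.todo_count: Int
 * @live
 */
export function todo_count() {
  return {
    // Called whenever Relay needs the current value.
    read: () => db.getTodoCount(),
    // Called once per subscription; call `notify` whenever the value may have
    // changed (crudely: on every db write). Assumed to return an unsubscribe function.
    subscribe: (notify) => db.onChange(notify),
  };
}
```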

flow-danny commented 6 months ago

With a generic entity schema it will be easy to know which Node IDs are invalid, but is there currently a way to invalidate only the active queries currently rendering those nodes?

flow-danny commented 6 months ago

I also looked at Relay resolvers, but they work so differently to normal resolvers... tied to fragments instead of the schema itself.

Skipping all the networking and JSON back and forth, a regular graphql-tools resolver will already give you subscriptions, which is basically reactive.

I'm sure it's possible to stitch a server schema in there and have the client-side resolver do a regular network fetch.

The resolver could also be compiled using something like graphql-jit to reduce overhead.
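A rough sketch of that combination, with a placeholder schema and resolvers; treat the wiring below as illustrative rather than a recommended setup.

```js
import { makeExecutableSchema } from '@graphql-tools/schema';
import { compileQuery, isCompiledQuery } from 'graphql-jit';
import { parse } from 'graphql';

// Placeholder schema and resolvers, standing in for the real client-side ones.
const schema = makeExecutableSchema({
  typeDefs: `type Query { greeting: String }`,
  resolvers: { Query: { greeting: () => 'hello' } },
});

// Pre-compile the query once; execution then skips per-request interpretation.
const compiled = compileQuery(schema, parse(`{ greeting }`));

if (isCompiledQuery(compiled)) {
  // compiled.query(root, context, variables) may return a value or a promise.
  Promise.resolve(compiled.query({}, {}, {})).then((result) => {
    console.log(result.data); // { greeting: 'hello' }
  });
}
```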

captbaritone commented 6 months ago

I also looked at Relay resolvers, but they work so differently to normal resolvers... tied to fragments instead of the schema itself.

Sorry for the confusion. We've been expanding Relay Resolvers to enable them to model arbitrary client state with field-level reactivity. I've just merged a PR which adds documentation for this experimental feature. You can read more here: https://relay.dev/docs/next/guides/relay-resolvers/introduction/

Lalitha-Iyer commented 6 months ago

Abstracts away the distinction between server and client data for product code enabling a single declarative API for reading data.

Would this mean we would continue to use client state management libraries like Redux but provide a common abstraction for querying?

captbaritone commented 5 months ago

@Lalitha-Iyer

Would this mean we would continue to use client state management libraries like Redux but provide a common abstraction for querying?

Correct. You could continue to use Redux or similar for true client state or non-GraphQL data. However, to get the advantages of Relay for network data (defining data dependencies inline without introducing data-fetching waterfalls during loading), you would want to ensure all network data was still coming from GraphQL running on the server.
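For illustration (not from the thread), this is roughly what the "single declarative API" looks like from product code: one fragment mixing a server-backed field with a hypothetical client-backed field named `is_selected`.

```js
import * as React from 'react';
import { graphql, useFragment } from 'react-relay';

// One fragment, two kinds of fields: `text` comes from the server, while
// `is_selected` is a hypothetical client-only field (client schema extension
// or Relay Resolver). The component reads both the same way.
export function TodoRow({ todoRef }) {
  const todo = useFragment(
    graphql`
      fragment TodoRow_todo on Todo {
        text
        is_selected
      }
    `,
    todoRef,
  );
  return (
    <div>
      {todo.is_selected ? '* ' : ''}
      {todo.text}
    </div>
  );
}
```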

ivosabev commented 5 months ago

I am not sure I understand the benefits of Resolvers compared to client schema extensions, which could completely replace a local state library like Redux.

captbaritone commented 5 months ago

@ivosabev When we first started looking at the problem, that's what we thought as well! But it turns out that while Client Schema Extensions could technically be used to model all local state, like Redux does, it's not very practical to do so once you have a meaningful amount of client state to model. You would need some layer which ensured that all the records/fields your product code expected to read were pre-populated in the Relay store. As updates happen in your app, you'd need to write them back to the Relay store, which is often not practical. For example, if you had a todo app and wanted to transfer all tasks owned by one user to another user, you'd have a very tricky set of tasks to do:

1. Locate all the Todo records owned by the user (you'd either need to do a full scan of the store, or define a separate query to read these out)
2. Manually (without a good typesafe API) update each Todo record to have the new owner
3. Locate the old user's record in the store
4. Find their all_todos field and empty it (note that you might also need to update other fields like overdue_todos or all_todos(completed: true), each of which has its own field in the store)
5. Locate the new user's record in the store
6. Find their all_todos field and add the new items (potentially sorting them)

Note that with all of the above you have ~0 type safety to help ensure you don't miss something, since Relay can't know all the places that need to be updated.

This is a lot of complexity. In practice we've found it's much nicer to have the source of truth for your data live in some sensible reactive data store (Redux or similar) and then define functions (resolvers) which model how it gets projected into the graph schema, just like a server would do. With this approach you get one-way data flow: you update the source of truth (the owner_id property on the todo items) and all the relevant fields that are currently being read are automatically recomputed.
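To illustrate that one-way flow under an assumed Redux-style store shape (none of this is from the thread), the ownership transfer becomes a single write to the source of truth, and the graph field is just a projection over it.

```js
// Assumed store shape: state.todos is an array of { id, text, owner_id }.

// The only write: reassign ownership in the source of truth.
function todosReducer(todos = [], action) {
  switch (action.type) {
    case 'todos/transferOwnership':
      return todos.map((todo) =>
        todo.owner_id === action.from ? { ...todo, owner_id: action.to } : todo,
      );
    default:
      return todos;
  }
}

// The projection: what a `User.all_todos`-style resolver would compute on read.
// Fields currently being read recompute from this; no normalized records have
// to be patched by hand.
const allTodosForUser = (state, userId) =>
  state.todos.filter((todo) => todo.owner_id === userId);
```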

leoasis commented 5 months ago

What should the semantics of mutations be in the context of a reactive GraphQL executor? Specifically, the response portion of the mutation is generally used to specify which updates the client would like to observe, but with a reactive executor we already expect to be notified of changes to any data we are currently observing.

This reminds me of the original idea in Relay Classic of mutations with "fat queries", where the library would calculate the intersection between all queries and the result of the mutation. This was somewhat inefficient and also non-deterministic, which was prohibitive for client-to-server architectures since it implied potentially ad-hoc queries (not possible to persist ahead of time), but maybe that is not a concern in a local-first architecture? Another alternative would be to make the compiler smart enough to calculate the potential intersection of the result and the active queries (perhaps with some extra hints), but that looks like it could explode very easily in complexity, and in the number of artifacts it would output.