Support for defer/stream

maraisr commented 3 years ago

Raising this issue with goal of gathering feedback about Apollo "re-adopting" the defer/stream capability. There has been huge momentum in recent months around this. graphiql now supports defer/stream out of the box and houses a sample implementation for this. We are also seeing leaps being made over in the spec https://github.com/graphql/graphql-spec/pull/742 So its only a matter of time before this will become an expected feature.

I know that this is scheduled for 3.5; but wondering If this is something that I can help contribute back across the line? I gave this is a fair crack over at apollo-link-multipart using meros so have had some experience in adding it. But believe adding this into the core would enable greater adoption.

Links~

grpahiql support: https://github.com/graphql/graphiql/pull/1770
https://github.com/graphql/graphql-spec/blob/main/rfcs/DeferStream.md
https://github.com/graphql/graphql-over-http/blob/main/rfcs/IncrementalDelivery.md
Sample server implementation: https://github.com/contrawork/graphql-helix
transport handler: https://github.com/maraisr/meros

maraisr commented 3 years ago

@acao thought you'd be interested here

benjamn commented 3 years ago

@maraisr Thanks for the nudge! At a very high level, I think the link chain is the right place to handle the multipart HTTP requests, so your past work may be largely transferrable/adaptable, either by integrating this functionality into the default HttpLink class, or by having a separate @apollo/client/link/multipart link package (let me know if you have any thoughts/preferences about that). On the Apollo Client side, we need to make sure all the right things happen when a link delivers multiple results over time, but I'm optimistic that functionality is already mostly in place.

maraisr commented 3 years ago

Personally I feel this probably belongs in the HttpLink, but really can go in either place. The defer/stream directives are optionally implemented by the server, which means if a server exposes that we can go with;

Apollo will "be ready" in that all HttpLink's will support that transport.
If a server supports it, it'll be opt in client support.
- My only gripe with this one is that IDE tooling and what-not will give the user the illusion they can use those directives without potentially knowing there also needs to be transport changes.

So with that; my vote is in HttpLink—meros is transparent to this, so for clients w/o defer/stream support they'd know no difference.

Let me know what you wish and I can surely add a PR; @benjamn

acao commented 3 years ago

agreeing with @maraisr here, similar situation with urql. IncrementalDelivery is unrelated to the multipart upload transport spec, because it’s simply a new variant on the possible response types for the http spec that HttpLink already handles.

This spec is already supported by graphql servers other than apollo on projects where folks use your client, such as graphql helix, postgraphile, absinthe, etc.

That said, from a design perspective, the link works better at least for an experiment, and maybe once the spec is finalized and at the right level, then the behavior could be merged to HttpLink?

maraisr commented 3 years ago

I have embarked on this journey! 🎉 but do need some clarification.

I think for this to stay backwards compatible the current link chain apis should stay consistent; in that they get given the response result object (data, errors, ...). How do we see incremental payloads get passed? My thinking here, "we build it up"—in that initial response builds out the data, errors object, we yield that, when the second payload arrives, we deep merge with the previous and yield, so on and so forth:

eg for the query:

{
    firstName
    ... @defer {
        lastName
    }
}

yield { data: {firstName: "bob"} }
yield { data: {firstName: "bob", lastName: "the builder"} }

query MultiDefer {
    user {
        firstName
        ... @defer(label: "lastName") {
            lastName
        }
        contact {
            ... @defer(label: "contact_phoneNumber") {
                 phoneNumber
            }
        }
    }
}

yield { data: {firstName: "bob", contact: {}} }
yield { data: {firstName: "bob", contact: {}, lastName: "the builder"} }
yield { data: {firstName: "bob", contact: {phoneNumber: 123}, lastName: "the builder"} }

Secondly; Where do we think this ends? I mean like, is this simply in transport—or can we leverge some of the information given further down? IncrementalDelivery spec defines an label arg that can be used to identify the payload that came in—from others. Granted im not 100% aware of the Apollo api around fragments, now Relay has this concept where components can useFragment and some other parent component does the useQuery—so just checking if there could a use-case here to use the label to for instance notify all fragments about changes, and not others.

eg with the MultiDefer query above, one component has the phoneNumber, and another the lastName.

Seeing as defer's are raced on server side, the phoneNumber could come back before/after the lastName. So in terms of client updates if one component only cares about the phone number the lastName component gets a useless update.

Now as far as I understand useQuery this is a complete document, and fragments are shared through interpolation for DX sharing, but not so much for masking. So this concept may not be useful??

Thirdly; In terms of the useQuery api, the result "loading: boolean" would this be true till the last incremental payload arrives? Thinking here yes, and NetworkState could have a "incrementalPayloads" or something flag. Also in the polling stuff, how would this work? Do we kick off a new poll interval once the last payload arrives, which could then potentially not marry up to interval defined?

Further this, I'm almost tempted introduce a useAsyncQuery (name??) that does this in a sane way such as introduce a result variable hasMore: boolean to indicate if all payloads have arrived, and not support polling or support it within caveats around timings etc... And then for useQuery it'll effectively then await for the last payload before yielding.

Ill most likely have more questions, so would appreciate some time with ya'll in terms of direction or to spit ball ideas on how to bring and fully leverage this within the client—and once again, happy to spearhead the implementation (progress: https://github.com/maraisr/apollo-client/tree/incremental-delivery).

cc @kkemple @abernix @acao @benjamn

benjamn commented 3 years ago

@maraisr Sorry for leaving you hanging for so long here!

As long as the incoming data payloads get written into the InMemoryCache, all queries and fragments that are watching the cache will be notified automatically of the new data. This could either mean writing the entire result multiple times, with incremental patches applied, or somehow writing only the data that have changed. The latter approach might be more efficient, but trickier. I would prefer for it to work without relying on the label, but I see how label could help.

I think we will end up introducing a new NetworkStatus.partial enum value to indicate that a query result is still missing some deferred parts. The NetworkStatus concept is something that exists in ApolloClient outside of ApolloLink, so I don't think we can terminate everything within ApolloLink.

While I hope that answers at least some of your questions, and I very much appreciate you sharing your branch as a proof of concept, I have to warn you that we (the core team) will probably have an easier time navigating the internals of Apollo Client than you (our fault, not yours), so it might make sense to hold off on a full PR, to protect your personal time. We will look for a ways to use meros (thanks for that), and I will definitely be reaching out to you for your review, when we have something ready for review/testing.

Scheduling-wise, this is a priority for the v3.5 release (I got it moved up from v3.6), so I hope you won't be waiting long. Thanks again for going above and beyond to get the ball rolling on this project.

hwillson commented 3 years ago

Let's re-open this for tracking (we're aiming to have this in @apollo/client@3.5).

hwillson commented 3 years ago

Linking to https://github.com/apollographql/apollo-feature-requests/issues/3

brainkim commented 3 years ago

Howdy all. I’ve been tasked with implementing defer/stream for Apollo Client. I’m going to piggyback this issue to air my specification questions.

Specification Issues

The following is a growing list of questions and concerns about parts of the specification which I personally believe are underspecified, and should be brought to the various GraphQL committees.

Do GraphQL server implementations provide any guarantees with respect to patch path ordering?

For instance, would it be possible to receive a patch whose path was ["a", "b"] before a patch whose path was ["a"] (topological sorting)? Additionally, for streams, is it possible to receive patches whose indices are out of order (["a", 1], ["a", 0]), or to receive an ordering which indicates a sparse list (["a", 2], [“a”, 4])? It seems that for defer, we can definitely receive elements out of order.

This may be addressed in the execution part of the specification. It would have ramifications for Apollo Client insofar as if we receive patches out of topological order, we may have to create intermediate entities, sort of like how mkdir -p creates directories which don’t exist along a path.

What are the write semantics of defer and stream?

Defer and stream have slightly different semantics when it comes to what we do with the object under the path. Because defer is only used with fragments, the payload data should essentially be merged into the object under the path, while stream seems like it’s meant to overwrite any existing values because the value shouldn’t exist for that specific operation yet. Do defer and stream patches have explicit write semantics which clients should follow?

How does this work with normalized caching? If we defer fields which are essential to identify a normalized entity (id, _id), what is the identity of the entity which we already have? Is there a way for servers to mark specific important fields as un-deferable? Should fields with ID types never be deferred?

One potential issue with defer and stream for Apollo Client is that if an entity’s key fields are deferred/streamed, we’re not really going to know how to identify and cache the partially deferred entity. Maybe we can reuse some logic around optimistic mutations.

Finally, what are the semantics of result ordering when there can be multiple operations which can return incremental results in any order. In other words, if query A is requested before query B, but query A is an incremental request and query B completes before the deferred parts of query A, should we consider the deferred updates to entities in query A as being the latest updates, or should we consider query B’s data as the latest for all entities?

Do defer and stream directives make sense for non-query operations?

What are the use-cases for defer and stream in mutations or subscriptions? There is this issue (https://github.com/graphql/graphql-spec/issues/783), and it looks like at least defer is implemented for mutations in the graphql-js repository.

I think supporting defer and stream for queries and mutations is fine, but for subscriptions it sounds like a nightmare with many potential edge cases. We are not helped by Apollo Client’s patchwork support for subscriptions.

What is the relationship of the stream directive to pagination and subscriptions?

To me, it doesn’t feel like the concerns of “pagination,” which is typically argument driven, are fully separate from concerns of “streaming,” which is directive driven. The main differences are that with streaming, there isn’t any way to specify offsets or limits (the initialCount argument is entirely optional and apparently doesn’t even have to be respected).

What are the expectations with regard to the duration of a deferred/streamed response? Can the client end incremental delivery abruptly? Are there limits to how long a response can remain open? If a list can be streamed indefinitely, doesn’t this overlap with the functionalities provided by GraphQL subscriptions?

The growing consensus at Apollo is that we should adopt on the incremental delivery and defer directive first, and assess stream for later, because of these sorts of questions.

What is the relationship between the incremental delivery proposal and the batching proposal?

To be clear, “batching” is a way for GraphQL clients to send multiple operations to a server, by serializing multiple operations together as a JSON array, and then receiving corresponding results as a JSON array. This is confusing because the incremental delivery proposal also has a “batching” aspect, where an individual response frame can contain multiple patches, serialized as a JSON array. As far as I can tell, no one’s given much thought yet into how operations which contain defer/stream directives could be batched in the former sense.

How should we handle fields which are deferred/streamed and requested immediately within the same operation?

You can imagine queries like the following:

{
  greeting
  ... on Query @defer {
    greeting
  }
}

Are there any special considerations for clients here?

What is the relationship of defer/stream to non-nullability and code generation?

Insofar as deferring or streaming fields can cause fields/elements of lists to be missing, is it a mistake to allow non-nullable fields or lists to be deferred/streamed?

This is less of a concern for Apollo Client, because we’re not the ones doing the TypeScript codegen anymore, but it’s been raised as an issue by our mobile developers, who work with slightly more strongly typed languages like Kotlin and Swift. I’m also worried about the TypeScript side of things, insofar as whatever API we choose, we’ll have to coordinate for codegen support.

How do errors work with incremental delivery?

GraphQL ExecutionResult objects can contain partial errors, which, like patches, have paths. Should these paths be relative to the root or to the path of the patch? Can errors arrive in the initial result for paths which are deferred/streamed?

Why shouldn’t we allow individual fields to be deferred?

The defer directive is restricted to fragments and not fields). As far as I can tell, the justification is that developers might unwittingly defer too many fields individually and cause poor user experiences. I’m wondering because I personally think this is an artificial limitation, and that it should be allowed for fields as well. This is not a deal-breaker for me, however, and I’m willing to defer this issue til later.

As you can see from #8184, the current plan for Apollo Client is to implement the Incremental Transport and defer directive first, and leave stream til later. This is because of some open questions with regard to stream which we’d like to get sorted. If any GraphQL spec people or Apollo Client users have any thoughts on this. Please let us know!

maraisr commented 3 years ago

Hi @brainkim thanks for doing this.

Ill try my best to answer your questions!

Do GraphQL server implementations provide any guarantees with respect to patch path ordering?

Yes it should, defer doesn't care defer will never send items in an array, but only its properties. The only point there is the object should mirror the query, so if properties are sent out of order, i dare say this is an oversight and is okay to ignore. With respects to streams, that is enforced [a, 0] must be flushed before [a, 1] from a server.

What are the write semantics of defer and stream?

So with the data coming back re streams dset was groomed for this exact case, its basically a deep merge at path. You can stream with defer, so a stream will always .push, but then a defer may come in (after) and set some more values.

Re the ID question, Relay where its used will always sync ask for id, irrespective of a users choice with the directive. But still a valid point, and will raise with committee.

I think with your ordering question, the idea is to use the label field to figure out which one came back. But once again a point ill raise with committee, Relay doesn't know anything about multiple operations. My guess is its simply wont be supported, that or operationName would mean only its defers would matter.

What is the relationship of the stream directive to pagination and subscriptions?

Yeah agree, think defer on subscriptions wouldnt make sense. And even if there was a use-case, it wouldnt be delivered in the same way as it does with queries. It would rather just be subsequent payloads mirroring the same patch mechanism as it does now. The idea with defer/stream isnt how its transported, but rather asking the server to give data sooner. So the same idea should be avaliable to subscriptions, just its transport is a little undefined now. Once again will raise with committee.

What is the relationship between the incremental delivery proposal and the batching proposal?

Yeah great question. They are different. You could probably mimic the same, but you'd never get complete batched operations. Defer allows for only 1 initial frame (the query) and then every other @defer "whenever". Batching allows for n number of operations to arrive complete. So its deterministic.

But as per your example,

{
  ... on Query @defer {
    hello
  }
    ... on Query @defer {
    world
  }
}

You'd get the same vibe. Once again something we should mention in the proposal around what its differences are.

What is the relationship of defer/stream to non-nullability and code generation?

I think this would be up to individual clients to decide how to handle this. Relay marks those fields as null able during its codegen. And through a renderPolicy you can state if you want the original to be considered complete, or give me data when its avaliable.

How do errors work with incremental delivery?

Yeah haha i had this same question. Graphiql just settled on appending to the errors array. Facebook have told me that errors in the array should only be scoped to the fields present in the frame.

But yeah all great questions. @robrichard care to chime in? @benjie how can i get a committee hearing?

benjie commented 3 years ago

@maraisr If you can make it, add it to Thursday’s WG: https://github.com/graphql/graphql-wg/blob/main/agendas/2021-05-13.md

(Feel free to reach out to me on Slack/Discord/Twitter if you need guidance; I can also help via the WG agenda PR.)

hwillson commented 3 years ago

@maraisr are you still interested in talking about this on tomorrow's working group call? @brainkim and myself are planning on attending, and can submit a PR to have this added to the agenda.

maraisr commented 3 years ago

I think let me gather my thoughts a little more, lets get this added to next time. Feel like these sorts of meetings are not for time wasters, so want to be more prepared. @hwillson

That is obviously if @brainkim an wait. Do you have enough information to keep moving, or are those answers critical and youre blocked atm?

brainkim commented 3 years ago

@maraisr I don’t think I’m blocked on anything as of yet. I’m currently working off the behavior of graphql-helix, but these same issues will likely play a role in defer/stream support for Apollo’s server as well.

Feel like these sorts of meetings are not for time wasters

I think I’ve done enough research on this that we can do the meeting tomorrow, and I’d like to keep my momentum. Let me know if you think otherwise.

maraisr commented 3 years ago

Are you going to be attending the WG meet tomorrow around this issue? Perhaps I could be a fly on the wall to chime in as its being discussed, if youre going to lead the discussion? @brainkim

brainkim commented 3 years ago

@maraisr Seems like it’s a bit too short notice for the relevant actors to get together for the next meeting. I prefer async communication to be honest.

avisra commented 2 years ago

Is this ticket still being worked on?

maraisr commented 2 years ago

No, I was told to stand down.

Amnesthesia commented 2 years ago

Does this mean @defer and @stream are straight up not supported in Apollo today?

benjie commented 2 years ago

They're not yet a part of the GraphQL Draft Spec, we're still iterating on the payload formats/etc - if you're interested, you can read notes from last night's working group here: https://github.com/graphql/graphql-wg/blob/main/notes/2022/2022-05-05.md#deferstream-asynchronous-iterators-of-iterables-versus-of-items-continuation-of-last-wg-meeting-15m-yaacov

apollographql / apollo-client