blitz-js / blitz


RPC Specification #85

Closed · flybayer closed this 3 years ago

flybayer commented 4 years ago

This issue is for defining the RPC specification used for Blitz queries and mutations. Each query and mutation can be used by third parties, so the specification must be clear and documented publicly.

This initial specification is my first pass. I haven't thought real deeply about this, so please help me refine it!

Specification

This specification is used for both queries and mutations. From an HTTP perspective, there is no difference between a query and mutation because they are both function calls.

Details

HEAD

This will be used to warm the lambda. For example, when a component renders on the client that uses a mutation, it will issue a HEAD request to the mutation endpoint immediately on page load so that there's a much higher chance the actual mutation request will hit a warm lambda.
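For illustration, here is a minimal sketch of what that client-side warm-up could look like. The hook name and endpoint path below are hypothetical, not actual Blitz internals:

// Hypothetical sketch: warm the lambda behind an RPC endpoint on mount.
import {useEffect} from "react"

function useWarmEndpoint(url) {
  useEffect(() => {
    // Fire-and-forget HEAD request; we only care about waking the lambda,
    // so failures are ignored.
    fetch(url, {method: "HEAD"}).catch(() => {})
  }, [url])
}

// Inside a component that will call the mutation:
// useWarmEndpoint("/api/mutations/updateUser")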

POST

Used to actually execute the RPC

Example:

// Valid request
{
  params: {id: 1},
}

// Success response
{
  result: {id: 1, name: "Brandon"},
  error: null
}

Responses

Scenario                          HTTP code   Response result value                        Response error value
-----------------------------------------------------------------------------------------------------------------
HEAD                              200         -                                            -
Valid request                     200         -                                            -
Invalid request, not JSON         400         null                                         {message: "Request body is not valid JSON"}
Invalid request, missing params   400         null                                         {message: "Request body is missing the 'params' key"}
Execution success                 200         exact value returned from query/mutation     null
Execution error                   200         null                                         TBD
All other HTTP verbs, like GET    404         -                                            -
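As a rough sketch (not the actual Blitz implementation), a Next.js API route wrapper following this table could look like:

// Hypothetical sketch of an endpoint handler following the table above.
// `resolver` stands in for the query/mutation function being exposed.
function rpcHandler(resolver) {
  return async (req, res) => {
    if (req.method === "HEAD") {
      // Warm-up request, no body needed
      return res.status(200).end()
    }
    if (req.method !== "POST") {
      // All other HTTP verbs, like GET
      return res.status(404).end()
    }
    if (typeof req.body !== "object" || req.body === null) {
      return res
        .status(400)
        .json({result: null, error: {message: "Request body is not valid JSON"}})
    }
    if (!("params" in req.body)) {
      return res
        .status(400)
        .json({result: null, error: {message: "Request body is missing the 'params' key"}})
    }
    try {
      const result = await resolver(req.body.params)
      return res.status(200).json({result, error: null})
    } catch (error) {
      // Exact error shape is TBD (see the error handling discussion below)
      return res.status(200).json({result: null, error: {message: error.message}})
    }
  }
}

// A generated file like pages/api/queries/getUser.js could then export:
// export default rpcHandler(getUser)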

Versioning

Any time you have client-server communication, there is a very real risk of the client & server code becoming out of sync and causing an error. For example, let's say you have a mutation function whose input is a single object. You decide to change the mutation input to be a single array. You make the change and deploy it. Your server code is now updated, but all your users still have the old client code running in the browser. So it's likely you now have client code sending an object to your mutation which is going to error because it's expecting an array.

Now one obvious solution is to only make changes that are backwards compatible, but sometimes you can't do this.

The other solution is to track the version of your client code and your server code. If there is a version mismatch, you can display a friendly message to the user, telling them they need to reload the webpage to continue.

Blitz is well positioned to provide this, so let's do it.
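A sketch of how such a check could work; the header name, build-id source, and 409 status are assumptions for illustration, not a settled design:

// Hypothetical sketch: the client sends its build version with every RPC
// call and the server rejects on mismatch, so the client can prompt a reload.
// The header name and the 409 status code are assumptions.
const SERVER_VERSION = process.env.BUILD_ID // e.g. injected at build time

function checkClientVersion(req, res) {
  const clientVersion = req.headers["blitz-client-version"]
  if (clientVersion && clientVersion !== SERVER_VERSION) {
    res.status(409).json({
      result: null,
      error: {message: "A new version of this app is available. Please reload the page to continue."},
    })
    return false
  }
  return true
}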


merelinguist commented 4 years ago

@flybayer Do you want error to be an object or array? Just thinking that e.g. for validation errors, there may be multiple problems to report.

flybayer commented 4 years ago

@merelinguist I think the top level error field should be an object. Then we can have sub fields that are arrays.

I'm headed to bed for the night, but tomorrow I plan to create the other issue on error handling. I'll think about this more then :)

camilo86 commented 4 years ago

In case of form validation, maybe the error object should contain an inner object with detailed errors for each field:

// error object
...
"errors": {
    "fieldNameA": ["Error message 1", "Error message 2"],
    "fieldNameB": ["Error message 1"]
}

Or would something like this fall under the developer's responsibility to handle on their own?

merelinguist commented 4 years ago

I’d expect it to look a bit more like:

errors: [
  {
    path: "email",
    message: "invalid email"
  }
]
flybayer commented 4 years ago

Let's discuss the actual error format in the separate issue on error handling. I'm working on it at the moment and will link here when I'm finished.

For the purposes of this specification, error should be an object for maximum flexibility. But the spec shouldn’t dictate anything beyond that.
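Under that assumption, all a caller can rely on is the {result, error} envelope. A minimal sketch of a client-side call:

// Minimal sketch of calling an RPC endpoint and handling the envelope.
// The spec only guarantees `result` and `error` keys; everything inside
// `error` beyond it being an object is left to the error handling issue.
async function callRpc(url, params) {
  const res = await fetch(url, {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({params}),
  })
  const {result, error} = await res.json()
  if (error) {
    throw Object.assign(new Error(error.message || "RPC error"), error)
  }
  return result
}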

flybayer commented 4 years ago

I just created the issue on error handling! #87

flybayer commented 4 years ago

@ryardley what do you think of this?

ryardley commented 4 years ago

Looks good. One concern I see here is how we handle security issues such as CSRF and client authentication in this context. That's probably its own spec and will be handled in an RFC, but it will affect this.

flybayer commented 4 years ago

@ryardley yeah, I think all of those will be handled via cookies and headers, not in the actual JSON body.

merelinguist commented 4 years ago

Love the sound of the versioning, by the way!

HaNdTriX commented 4 years ago

First of all thanks for the awesome work you have been doing 🙏. I am really impressed by how fast things are evolving.

From my point of view, handling RPC queries and mutations exactly the same might be a mistake. Especially when it comes to caching, there are quite a lot of differences between queries and mutations. The most important difference is that POST requests are not idempotent and cannot be cached, which makes POST a perfect candidate for mutations.

Queries, on the other hand, are idempotent and may be cached, so the HTTP GET verb is a good candidate for them.

That's why I am proposing to use GET for queries and POST for mutations; see the sketch below.

GraphQL background: many GraphQL implementations initially used POST requests for both mutations and queries. This is due to the fact that a serialized GraphQL query might exceed the URL length limit of modern browsers. As a solution, persisted queries arose and allowed GET requests for queries. I don't think this restriction applies to queries in the blitz.js context, so I would go for the more performant GET approach.
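As a sketch, params could be serialized into the query string to make the request cacheable. The `params` query key and the helper below are illustrative assumptions, not a proposed standard:

// Hypothetical sketch: encode query params into the URL so responses
// can be cached by the browser or a CDN.
function buildQueryUrl(endpoint, params) {
  const search = new URLSearchParams({params: JSON.stringify(params)})
  return `${endpoint}?${search.toString()}`
}

// buildQueryUrl("/api/queries/getUser", {id: 1})
// => "/api/queries/getUser?params=%7B%22id%22%3A1%7D"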

flybayer commented 4 years ago

@HaNdTriX thank you for bringing this up!

You are right, it would be nice to benefit from HTTP caching.

Blitz queries can also have very large serialized input arguments, but the payload will always be smaller than GraphQL's, since you are only sending arguments instead of arguments plus an arbitrary GQL query structure, and you are only making one query at a time instead of a huge document of queries.

Do you happen to know what the practical size limit is for URL length? We need to determine exactly what argument size we could support with GET.

HaNdTriX commented 4 years ago

It depends. This limit may vary based on your CDNs and the browsers being used.

As a rule of thumb, a limit of 2,048 characters for the full URL string used in the address bar should be safe. At the Ajax level this limit may be up to 2.5x higher, based on some tests.

Browser     Address bar   document.location
                          or anchor tag
------------------------------------------
Chrome          32779           >64k
Android          8192           >64k
Firefox          >64k           >64k
Safari           >64k           >64k
IE11             2047           5120
Edge 16          2047          10240

source

Since these limits always target the full URL length, it is difficult to determine a limit from the framework side. Blitz doesn't and shouldn't know about the origin it is running on.

URL = origin + pathname + search + hash

In addition to that, the size of the params is dynamic and only known at runtime. Warning the user is already done by the browser/CDN, and by that point it is too late anyway.

If we want to be really safe, we could check the size of our URL at runtime and fall back to a non-cacheable POST request if we exceed a specific limit, as in the sketch below.
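Something like this; the 2,000-character threshold is an assumption based on the table above, and buildQueryUrl is the hypothetical helper from the earlier sketch:

// Hypothetical sketch: prefer a cacheable GET, but fall back to POST when
// the serialized URL would exceed a conservative length budget.
const MAX_URL_LENGTH = 2000 // assumed safe threshold, see the browser table above

async function fetchQuery(endpoint, params) {
  const url = buildQueryUrl(endpoint, params) // from the earlier sketch
  if (url.length <= MAX_URL_LENGTH) {
    return fetch(url) // cacheable
  }
  return fetch(endpoint, {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({params}),
  })
}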

HaNdTriX commented 4 years ago

While we are talking about limits: please also think about the body-parser limit of POST requests. Next.js uses a body-parser middleware under the hood that may be configured as follows:

// pages/api/queries/somefile.js
export const config = {
  api: {
    bodyParser: {
      sizeLimit: '1mb',
    },
  },
}

This limit should not be too high, because otherwise it opens the door for all kinds of attacks. If it is too low, legitimate requests might be aborted. It may be useful to think of a way for the user to configure it.

flybayer commented 4 years ago

@HaNdTriX wow, that's super helpful! Thank you!

It sounds like it will be good to switch to GET and then fall back to POST if it's past 2k chars or so.

Also good point on the body-parser size. We can definitely add a configuration item for that.

Do you think we should use a different size limit than default for queries and/or mutations?

HaNdTriX commented 4 years ago

It sounds like it will be good to switch to GET and then fall back to POST if it's past 2k chars or so.

Or just retry with POST on 414 URI Too Long.
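That reactive variant might look like this sketch (same hypothetical helpers as above):

// Hypothetical sketch: optimistically try GET and only retry as POST
// if the server or CDN rejects the URL with 414 URI Too Long.
async function fetchQueryWithRetry(endpoint, params) {
  const res = await fetch(buildQueryUrl(endpoint, params))
  if (res.status !== 414) return res
  return fetch(endpoint, {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({params}),
  })
}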

Also good point on the body-parser size. We can definitely add a configuration item for that.

👍

Do you think we should use a different size limit than default for queries and/or mutations?

When it comes to body-parser limits:

I guess Next.js chose a riskier default since Next.js applications primarily run on lambdas, where the damage is not so high if someone tries a body-parser attack. On a traditional single-process Node server (e.g. Express), a body-parser attack can bring down your entire process.

If an endpoint only listens to GET requests, we don't need a body-parser at all. If we want to support a POST fallback, I would go with the defaults for now (because I don't know better).

aem commented 4 years ago

Copying from my comment in Slack:

My company had to disable HTTP caching globally because it caused so many consistency problems; in my experience it causes more problems than it solves. My recommendation would be that if we want to enable it, it should be opt-in in the blitz config.

It’s definitely a problem that is more relevant at scale than it is in a small app, but in a high-traffic multi-user app it’s very possible for cached requests to cause serious bugs

HaNdTriX commented 4 years ago

@aem thanks for your feedback 🙏. I believe caching always depends on the use case and should be evaluated per route. Disabling it at a global level seems quite extreme to me.

Looking at the latest developments, we can see that the community is working hard on delivering static performance; the rise of the JAMstack has shown that. Next.js apps are static by default nowadays and might even allow things like iSSG (incremental static site generation) in the future.

When you take a deeper look at the current implementations around iSSG, you quickly realize that this technique heavily bets on RFC 5861, the HTTP Cache-Control Extensions for Stale Content, which allows the developer to push data to the edge. So it all boils down to one simple cache header (Cache-Control: max-age=600, stale-while-revalidate=30, or expressed via the Next.js revalidate API).
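In a Next.js API route, opting a single query endpoint into that behaviour could be as simple as this sketch:

// Sketch: opt one query endpoint into CDN caching per RFC 5861.
// Serve from cache for 10 minutes, then serve stale responses for up to
// 30 seconds while the CDN revalidates in the background.
export default function handler(req, res) {
  res.setHeader("Cache-Control", "max-age=600, stale-while-revalidate=30")
  res.json({result: {greeting: "hello"}, error: null})
}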

So Next.js will add its revalidation API to all static pages, and I believe Blitz might want to follow this path. @aem, I totally agree that we should never cache anything without explicit configuration. Nevertheless, we should allow this behaviour, not forbid it.

For me, the topic of caching has changed in the last year. It's no longer so much about saving some resources in the user's browser. Instead, I often try to move my "app" closer to the user by moving my entire code & data to the edge. This is done by caching at the CDN level (RFC 5861).

Caching is our tool to move data to the edge

That's why I believe allowing queries to be cacheable is a good thing 😊.


flybayer commented 4 years ago

Good thoughts @HaNdTriX!

A clarification: Blitz uses Next.js under the hood, so Blitz already has static page generation with getStaticProps and iSSG by using the unstable_revalidate config item from Next.js. This happens regardless of whether queries are cached or not.

Pages that can be cached on the CDN should use getStaticProps instead of useQuery with a cached query response. Make sense?
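For example (a sketch using the pre-stable Next.js API mentioned above; getProducts is a hypothetical data fetch):

// Sketch: a statically generated page the CDN can cache, re-built at most
// once every 60 seconds via iSSG. `getProducts` is a hypothetical fetch.
export async function getStaticProps() {
  const products = await getProducts()
  return {
    props: {products},
    unstable_revalidate: 60, // the Next.js config item referenced above
  }
}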

Most Blitz queries will return private data behind a login, so these queries can't be cached on the CDN. But they could be cached in the browser. However, we are already caching queries in the browser with our useQuery hook (built on react-query).

So maybe there's not a strong case for GET on queries?

aem commented 4 years ago

Appreciate the thoughtful answers here!

As @flybayer said, the actual static pages will always be cached, and I think we both fully agree with you that performance there is non-negotiable!

As far as the RPC requests go, responses will often differ depending on who is making them. In my opinion (and experience, as I mentioned above), HTTP caching for those data queries is incredibly complex and error-prone; it's hard to ensure you don't leak user data via a bad cache. Instead, using a combination of ServiceWorkers and localStorage or sessionStorage to cache requests yields higher data security (no chance of user A seeing user B's data) and much better performance for the host app, since the request loop is fully bypassed.
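As a rough sketch of that browser-side approach (key scheme, TTL, and the callRpc helper are assumptions carried over from the earlier sketches):

// Hypothetical sketch: cache query responses in sessionStorage, keyed by
// the current user so user A can never read user B's cached data.
function cacheKey(userId, endpoint, params) {
  return `rpc:${userId}:${endpoint}:${JSON.stringify(params)}`
}

async function cachedQuery(userId, endpoint, params, ttlMs = 60000) {
  const key = cacheKey(userId, endpoint, params)
  const hit = sessionStorage.getItem(key)
  if (hit) {
    const {expires, result} = JSON.parse(hit)
    if (Date.now() < expires) return result // fresh cache hit, skip the network
  }
  const result = await callRpc(endpoint, params) // from the earlier sketch
  sessionStorage.setItem(key, JSON.stringify({expires: Date.now() + ttlMs, result}))
  return result
}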