whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/
Other
2.1k stars 325 forks source link

Proposal: Easier request/response rewrites #671

Open kentonv opened 6 years ago

kentonv commented 6 years ago

Hi all,

For those that don't know, I'm the tech lead for Cloudflare Workers, which implements the Service Workers API (but runs code on Cloudflare's servers rather than in the browser).

In general the Service Workers and Fetch APIs have been very good for us (certainly better than what we would have ended up with had we invented our own). But, based on user feedback we are finding some pain points. This is the first of a few proposals I'll be making to solve these issues.

/cc @harrishancock who does most of the API implementation work on our team.

Problem statement

Today, making minor "rewrites" to a Request or Response (e.g. starting from an existing object and modifying just one property or header) is not very ergonomic. I'm not sure if this is common in browser Service Workers, but it is probably the dominant use case for Cloudflare Workers, so we're very interested in making it better.

For example, consider the case where I want to change the hostname, change the redirect mode to "follow" (to resolve redirects server-side), and add a header. This might look like:

addEventListener("fetch", event => {
  let request = event.request

  // Change URL.
  let url = new URL(event.request.url)
  url.hostname = "example.com"
  request = new Request(url, request)

  // Change the redirect mode.
  request = new Request(request, { redirect: "follow" })

  // Add a header.
  request.headers.add("X-Example", "foo")

  event.respondWith(fetch(request))
})

Notice how the best way to change each property is wildly different!

When it comes to the Response type, we have a bigger problem: you cannot pass an existing response object as the first parameter to Response's constructor, the way you can do with requests. (If you try to do so, the response object will be stringified as [object Response] and that will become the new response's body.) So, if you want to modify the status code of a Response:

addEventListener("fetch", event => {
  if (new URL(event.request.url).pathname.startsWith("/hidden/")) {
    // Mask our hidden directory by faking 404.
    event.respondWith(fetch("/404-page.html")
        .then(response => {
      // 404-page.html returns with status 200. Give it status 404.
      return new Response(response.body, {
        status: 404,
        statusText: "Not Found",
        headers: response.headers
      })
    }))
  }
})

This is bad, because if Response and ResponseInit are ever extended with a new field, that field will be inadvertently dropped during the rewrite. (We commonly see people doing Request rewrites this way, too, where it's an even bigger issue as RequestInit has quite a few fields that tend to be forgotten.)

Proposal

Let's make Request's constructor be the One True Way to rewrite requests. To that end:

Similarly, let's fix Response to use the same rewrite idiom:

annevk commented 6 years ago

There has been https://github.com/whatwg/fetch/issues/245#issuecomment-196866516 in the past.

The main problem with mutations is keeping all the security boundaries non-broken, but it seems this circumvents that largely by always starting with a fresh Request/Response object, just filled in from another one.

domenic commented 6 years ago

Maybe instead of making the constructors support two purposes, their usual one and the copying use case of the OP, it'd be better to add a dedicated static factory method.

I don't like duplicating headers's functionality on Request; right now those objects are nice and loosely coupled, and introducing indirection methods that couple them together seems like a bad idea when the goal is to just save typing a ..

kentonv commented 6 years ago

Maybe instead of making the constructors support two purposes, their usual one and the copying use case of the OP, it'd be better to add a dedicated static factory method.

I'm fine with that, but I note that the Request constructor already appears to be designed to support this use case, by passing the request as the first parameter. It just seems to be missing a couple pieces. So since that's already there, I feel like filling it out is cleaner than creating another thing.

I don't like duplicating headers's functionality on Request ... seems like a bad idea when the goal is to just save typing a ..

Since headers are mutable on newly-constructed objects, I can live with telling people to modify them there.

But it's a lot more than just typing a .. The way you rewrite headers is imperative, while the way you rewrite anything else is declarative. Those are two very different programming styles. Mixing them creates a fair amount of cognitive overhead. It's not intuitive to developers -- especially inexperiented ones -- why headers should work differently from everything else.

It doesn't help that when people try to "fix" their code to be consistent, it leads to weird failure modes. For example, say you have:

addEventListener("fetch", event => {
  let request = new Request(event.request, {redirect: "follow"});
  request.headers.set("X-Example", "foo");
  event.respondWIth(fetch(request));
})

You decide you don't like the style mixing, so you try to rewrite it to be all-declarative:

addEventListener("fetch", event => {
  let request = new Request(event.request, {
    redirect: "follow",
    headers: { "X-Example": "foo" }
  });
  event.respondWIth(fetch(request));
})

But it turns out this completely wipes out all the existing headers and replaces them. Whoops! But it seems to work because most request headers are not essential for GETs. You just silently lose things like caching (missing If-Modified-Since, etc.).

Or maybe you try to go the other way -- all-imperative:

addEventListener("fetch", event => {
  let request = new Request(event.request);
  request.redirect = "follow";
  request.headers.set("X-Example", "foo");
  event.respondWIth(fetch(request));
})

This produces no error, but the redirect mode change is silently ignored. (Well, it throws an exception in strict mode, but few people use strict mode, unfortunately.)

This can be really hard for inexperienced developers to wrap their heads around. And with CF Workers, we get a lot of very inexperienced developers trying to write code. (I suspect our developers are on average much less experienced then Service Worker develpoers.)

So my hope is that we can develop one common syntax for rewrites, with minimal warts, and get everyone to use that.

Alternative Proposal

Maybe could introduce a new rewrite() method:

let request = event.request.rewrite({
  url: "https://example.com/foo",
  redirect: "follow",
  headers: { "X-Example": "foo" }
});

This mostly has the obivous semantics, except when it comes to the headers sub-object. Whatever you provide for headers gets forwarded on to Headers.rewrite(). So the new headers above would be computed as event.request.headers.rewrite({"X-Example": "foo"}). We would then define Headers.rewrite() as a merge, not a replacement. It's equivalent to making a copy of the Headers object then calling set() for each key/value pair -- or if the value is null, it could be treated like calling remove().

I think this keeps Request and Headers loosely-coupled, since the semantics of header rewrites are defined by the Headers class. It has the down side that there's no obvious way to specify append() instead of set(), but that's rarely actually needed so maybe it doesn't matter.

Thoughts?

kentonv commented 6 years ago

@annevk

The main problem with mutations is keeping all the security boundaries non-broken, but it seems this circumvents that largely by always starting with a fresh Request/Response object, just filled in from another one.

Indeed, I think what I'm proposing is just syntax sugar and shouldn't have security implications. That said, these kinds of security issues don't really affect CF Workers so I could be missing something.

(CORS is a complete non-issue for us since a CF Worker obviously cannot hijack other origins' cookies nor reach behind-the-firewall services.)

annevk added "the needs implementer interest" label

How does this work given that I am an implementer? :) It's not necessarily important to me that browsers prioritize implementing these proposals. I just want to avoid a situation where we've implemented interfaces that browsers outright refuse to support for some reason, or a situation where browsers implement something incompatible.

annevk commented 6 years ago

We basically want at least two browsers to also support it to ensure long term success (or at least be more certain of it). https://whatwg.org/working-mode#changes goes into it to some extent.

kentonv commented 6 years ago

Here's a developer on Stack Overflow who ran into this problem: https://stackoverflow.com/questions/51485738/how-can-i-make-a-cloudflare-worker-which-overwrites-a-response-status-code-but-p/51488911

(Probably many others have written similar code but not recognized that there's a problem, or not bothered to ask about it.)

othermaciej commented 6 years ago

Speaking as a WebKit person: Request rewriting seems like a common use case for anyone using fetch with Service Workers, not just in the special CloudFlare environment. Given that, it seems worthwhile to make it easier and more convenient, if that can be done without making the API overly complex. I expect WebKit would be happy to implement a reasonable API along these lines.

I don't have much of an opinion on what specific syntax to use for it.

It's worth noting that sometimes APIs that have little active browser support, but no hard opposition either, sometimes still fail. Media sync API is one past example. So it would be good to get other implementers on the record.

kentonv commented 5 years ago

FWIW here's several examples of me answering StackOverflow questions where people didn't understand the current API:

https://stackoverflow.com/a/55958061/2686899

https://stackoverflow.com/a/54334947/2686899

https://stackoverflow.com/a/54281243/2686899

At some point when we have time (haha), we'll probably go ahead and introduce a better API for this, likely along the lines of my "alternative proposal" in my second comment in this thread. I'd love to work with the browser vendors on this if there's interest, but we're also happy to do it as a non-standard extension if you don't feel it matters for the browser.

jimmywarting commented 4 years ago

thinking this would be useful also, but in my current case it was not about rewriting but instead construct and mock a response object...

var res = new Response('', { url: 'https://example.com' })
console.log(res.url) // ""
res.url = 'https://example.com'
console.log(res.url) // ""

Why don't this work? I basically have to redefine or use a Proxy:

Object.defineProperty(res, 'url', { value: 'https://example.com' })
console.log(res.url) // "https://example.com"