whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

String representations for Fetch API objects #1575

Open sgammon opened 1 year ago

sgammon commented 1 year ago

Hello esteemed WhatWG Fetch authors,

I am a library author helping to implement fetch support downstream in the popular[^1] Axios library. For the uninitiated, Axios helps developers fetch data using various adapters (including XHR and Node's http module), and on various platforms (browsers and Node).

Now that Fetch is very broadly supported[^2] (congrats, and thank you for all your hard work!), it is seeing new support in libraries such as Axios (hi! 👋🏻). Before proceeding, I just wanted to say that the Fetch API is one of the smoothest and clearest APIs offered by the web, in my humble opinion; it is spreading because it is easy to use, refreshingly obvious in its behaviors and assumptions, and general enough to cover an extremely broad set of cases, from await fetch('...') to more complex scenarios with cancellation and streams.

That being said, there is exactly one place where frequent interaction with the Fetch API seems to fall short: string representations for Headers, Request, and Response. I'm writing today to see if the WhatWG can help clear this up across supporting implementations.

Paragon case: URL

When developing with URL, Request, and Response (referred to herein as ancillary Fetch API objects), the developer may often need to obtain a stringified representation, either for use elsewhere in their software, or for debugging purposes. URL objects are a bright spot of support for this: they are interchangeable with string objects for many intents and purposes within the realm of Fetch, and indeed in the browser in general:

const sample = new URL('https://github.com');  // you can create a URL from a string
console.log(sample);                           // logs the entire tree of components within the URL
console.log(`${sample}`);                      // logs the URL as a string
const string = sample.toString();              // obtains the URL in absolute string form
Screenshot 2022-12-19 at 3 52 04 AM

This behavior is consistent across all three major browser vendors:

Screenshot 2022-12-19 at 3 49 39 AM

From left to right: Firefox 107, Safari 16.2, Chrome 108.

Immediately, in the debug window, I can see the URL itself, and even browse the components of the URL. Fantastic!

Server-side runtimes also nail this:

Screenshot 2022-12-19 at 3 56 29 AM

Node 18.x on the left, Deno 1.x on the right.

Engines are remarkably consistent about URL, even across platforms and runtimes. Arguably, URL used in this way is a light value class, i.e. just a strongly-typed and validated shell around the well-formed string. This case remains sadly isolated from other Fetch API objects, though.

Problem case 1: Headers

I first encountered problems with Headers when diagnosing issues in the early Fetch adapter code. Let's try this snippet:

const headers = new Headers({x: 1});
console.log(headers);
console.log(`${headers}`);
console.log(JSON.stringify(headers));

What do we get?

Screenshot 2022-12-19 at 4 09 46 AM

Depicted: Chrome.

??

At first glance, the Headers object we just created looks empty. This is obviously not the case: the object has the header we put in it; it just isn't showing in the string representation. This can be confirmed with headers.has("x") or headers.get("x"), each of which returns the expected value.

Platforms and browsers are not as consistent on this point: Chrome is arguably the worst example (cross-filed here), but everyone fails this test except Firefox, with Node/Deno getting by (barely):

Screenshot 2022-12-19 at 4 15 37 AM
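
To make the gap concrete without the screenshots, here is a minimal sketch of the same check. It assumes a runtime with a global Headers (current browsers, Node 18+, Deno); the variable names are illustrative only.

```javascript
// Build the same Headers object as in the snippet above.
const headers = new Headers({ x: 1 });

// Default string conversions carry no information about the contents.
const asTemplate = `${headers}`;         // only the class tag
const asJson = JSON.stringify(headers);  // no own enumerable properties

// The data is there; it is only reachable through the Headers API.
const hasX = headers.has('x');
const valueOfX = headers.get('x');       // values are stored as strings

console.log({ asTemplate, asJson, hasX, valueOfX });
```

The template literal and JSON.stringify() results say nothing about the "x" header, while has() and get() confirm it is present.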

Problem case 2: Request/Response

I won't bore you guys with the setup; let's take a look at how they behave across platforms:

const req = new Request('https://github.com');
console.log(req);
console.log(`${req}`);
console.log(JSON.stringify(req));
Screenshot 2022-12-19 at 4 37 13 AM

Lovely debugging experience logging it directly, and, while I understand binary exchange is the norm with HTTP/2 in broad adoption, whatever happened to Request[GET https://github.com]? The virtue of HTTP/1.x was easy debuggability, and there's no reason we have to give that up in a post-binary world.

Screenshot 2022-12-19 at 4 36 53 AM

One more test, this time with Response:

const res = new Response('hello whatwg you guys rock');
console.log(res);
console.log(`${res}`);
console.log(JSON.stringify(res));
Screenshot 2022-12-19 at 4 40 03 AM

No HTTP 200 OK (51k bytes)? Okay, okay. I've made my point.

Why this is a problem

If URL behaves as a value class around a well-formed URL string (you can't create a URL object from an invalid URL string), one might assume that Headers, too, behaves as a value class around a well-formed map<!string, string[]>. This would make sense, especially since, as we all know, HTTP headers can be repeated, and so you always need a bit of ceremony around a regular Map to handle headers properly.

This assumption, though, is broken in a number of ways:

1) The developer can't JSON-stringify the headers object, as one would expect to be the case with a regular object

2) Printing the object does not reliably show the contents the developer is most likely to be interested in, as one would expect with a regular object

These are arguably cosmetic assumption breakages, but this problem gets a bit deeper. In order to diagnose or assert based on the full state of the Headers object, users are forced to iterate, carry + sync a second object, or defer use of Headers until the moment before a request is fired, which sadly neutralizes the benefits the API surface can provide (looking at you, repeated headers).

Avoiding the Headers object is lame because it's so great. Iterating carries the risk of exploding input. Carrying a second object and keeping it in sync has many pitfalls. Thus, the developer experience is surprisingly hampered by this one API incongruity.

Alternatives considered

1) Additional observable API detail. Perhaps WhatWG doesn't want to mandate disclosure of these internal object details unless and until the developer expresses an intent to obtain them: this allows implementors more latitude to privately cache or defer calculations with greater cover.

- **Mitigation:** `Headers`, `Request`/`Response` metadata, all account for a very small footprint of data. Of course, I don't think the developer expects the entire response or request body to be printed to the terminal, but I would be very surprised if the internal structure of these objects isn't optimized for heavy read access already anyway as a matter of practicality (an assumption is made here that these objects are more likely to be read-from than written-to, by and large).

- **Mitigation:** Perhaps it is possible to only emit such information when outputting these objects to the console (or, otherwise, in debugging circumstances), to avoid creating an observable API change and mitigate compatibility concerns.

2) Implementor freedom. Perhaps consensus is too high a bar for this kind of functionality to be written directly into the specification, or perhaps implementors have voiced a desire to control this aspect of how the Fetch API behaves.

- **Mitigation:** Perhaps WhatWG could consider mandating (or even recommending with a `SHOULD` or `MAY` clause) a minimum set of metadata to be displayed in the string representation for ancillary Fetch API objects. For example, the request method and URL are often sufficient to identify an in-flight request, and an HTTP status may be sufficient to identify the general disposition of a response.

3) Maybe this has been discussed. If so, my apologies, especially if a decision has been made on this topic. I was not able to find any related issues after a search on GitHub.

- **Mitigation:** My bad
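
As a rough illustration of the "minimum set of metadata" idea from alternative 2, the sketch below builds one-line summaries from properties that Request and Response already expose. The helper names describeRequest and describeResponse are hypothetical and not part of the Fetch Standard or any implementation; the output format simply mirrors the Request[GET ...] shape mentioned earlier.

```javascript
// Hypothetical helpers (not part of the Fetch Standard): one-line summaries
// built only from public, already-exposed properties.
function describeRequest(request) {
  return `Request[${request.method} ${request.url}]`;
}

function describeResponse(response) {
  // statusText may be empty, for example on a script-constructed Response.
  const status = response.statusText
    ? `${response.status} ${response.statusText}`
    : `${response.status}`;
  return `Response[${status}]`;
}

const req = new Request('https://github.com');
const res = new Response('hello whatwg you guys rock');

console.log(describeRequest(req));
console.log(describeResponse(res));
```

Note that request.url is the serialized URL, so it may differ slightly from the string passed to the constructor (for example, a trailing slash is added to a bare origin).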

Thank you in advance for consideration of this request. ✌🏻

[^1]: Publicly (on GitHub), 7.8 million projects use Axios. 103k of these are other packages which themselves constitute additional transitive addressable users.

[^2]: CanIUse places fetch support at 97.05% at the time of this writing.

annevk commented 1 year ago

With the notable exception of https://console.spec.whatwg.org/ standards don't do a lot of active work around debugging.

Stringification of URL objects is there because it helps when writing code. That's why it has toJSON() as well.

Some kind of approximate stringification of Headers, Request, and Response would not, I think.

toJSON() for Headers could be reasonable though. Just have to make sure we properly account for Set-Cookie when designing that.

console.log() for these objects providing a richer experience also seems quite reasonable. Perhaps the Console Standard can help with that somehow? cc @domfarolino
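
For readers wondering what "accounting for Set-Cookie" might involve, here is one possible sketch of a JSON-friendly conversion. The helper headersToJSON is hypothetical; nothing like it is defined by the Fetch Standard. It assumes that Set-Cookie values should stay separate (they can contain commas, so a joined string is ambiguous) and uses getSetCookie() where the runtime provides it, falling back to get() otherwise.

```javascript
// Hypothetical helper; the Fetch Standard defines no toJSON() on Headers.
function headersToJSON(headers) {
  const out = {};
  for (const [name, value] of headers) {
    // Set-Cookie is handled separately below, since its values cannot be
    // safely joined with commas.
    if (name === 'set-cookie') continue;
    out[name] = value; // other repeated headers arrive combined with ", "
  }
  // getSetCookie() is a newer addition; fall back where it is missing.
  const cookies = typeof headers.getSetCookie === 'function'
    ? headers.getSetCookie()
    : (headers.has('set-cookie') ? [headers.get('set-cookie')] : []);
  if (cookies.length > 0) out['set-cookie'] = cookies;
  return out;
}

const headers = new Headers();
headers.append('Accept', 'text/html');
headers.append('Accept', 'application/json');
headers.append('Set-Cookie', 'a=1; Path=/');
headers.append('Set-Cookie', 'b=2; Path=/');

console.log(JSON.stringify(headersToJSON(headers)));
```

Whether a real toJSON() should return arrays for every header, or only for Set-Cookie as done here, is exactly the kind of design question the comment above leaves open.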

sgammon commented 1 year ago

Thank you @annevk for reading and responding.

> Stringification of URL objects is there because it helps when writing code.

I couldn't agree more. Really, a console.log-able representation of each of these objects is the core of this request. Thank you and kudos to the Firefox team for a smooth dev experience with Headers.

> toJSON() for Headers could be reasonable though

The best alternative I've found so far, but definitely open to anything better:

Object.fromEntries(headers.entries())
Screenshot 2022-12-19 at 9 12 07 PM

Under the hood in the debug logs for the Axios fetch adapter, we need to call that fromEntries(...entries()) for each log statement, which is presumably rather expensive vs. what an internal representation could do.
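
One way to limit that cost, sketched here under assumptions, is to defer the conversion until a log line is actually going to be emitted. The createDebugLog helper is hypothetical and is not how the Axios fetch adapter is implemented; it only illustrates passing a function instead of a precomputed snapshot.

```javascript
// Hypothetical debug logger: the message is a function, so the Headers
// snapshot is only built when debug logging is actually enabled.
function createDebugLog(enabled, sink = console.log) {
  return (buildMessage) => {
    if (!enabled) return false; // skip the conversion entirely
    sink(buildMessage());
    return true;
  };
}

const headers = new Headers({ accept: 'application/json' });
const snapshot = () => Object.fromEntries(headers.entries());

// Logging disabled: snapshot() is never called.
const quiet = createDebugLog(false);
quiet(() => snapshot());

// Logging enabled: the snapshot is built once, at the point of logging.
const lines = [];
const verbose = createDebugLog(true, (message) => lines.push(message));
verbose(() => snapshot());
```

This does not make the conversion itself any cheaper; it only avoids paying for it when nobody is reading the logs.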

domfarolino commented 1 year ago

Console-wise, this is a tough one! The spec is quite vague to allow for implementation-specific formatting that is judged to be "maximally useful". Under https://console.spec.whatwg.org/#printer, see https://console.spec.whatwg.org/#generic-javascript-object-formatting and https://console.spec.whatwg.org/#optimally-useful-formatting.

So aside from filing bugs on implementations requesting a more useful custom visual representation of complicated objects, the only other alternative I can imagine is some sort of hook that console could provide to let complicated objects bypass those definitions and provide their own representation composed of primitives. Perhaps something like allowing objects to supply their own "console string builder" or "console object builder" etc., that the Console Standard can hook into and print the result of, instead of applying the vague definitions that I linked to above. That probably wouldn't be too hard to do and may well have the desired results you're looking for.

Feel free to file a bug on Console for this, but I will say that standard doesn't get a whole lot of love these days so any large contributions are more likely to materialize with help from the community!

sgammon commented 1 year ago

@domfarolino I see. Thank you for the references, I'll take a look.

What about a custom [Symbol.*] which provides a function tailored to emit a better console representation? I understand stringification is already a point of API compatibility, so surely an opt-in upgrade for implementers would be best if this is fixed by spec.

> Feel free to file a bug on Console for this, but I will say that standard doesn't get a whole lot of love these days so any large contributions are more likely to materialize with help from the community!

Forgive me, I'm new to standards contribution :) I would love to file a contribution for this if it is a welcome PR from WhatWG's perspective. How do I get started? I'll take a look at that spec, contribution guides, etc. Would I just fork, and file a PR change to the spec itself?

I happen to be implementing the Fetch API in a different project entirely which involves a JS runtime, so I have an opportunity to dogfood those new Console extensions myself and iterate quickly if that is at all helpful.

domfarolino commented 1 year ago

> What about a custom [Symbol.*] which provides a function tailored to emit a better console representation? I understand stringification is already a point of API compatibility, so surely an opt-in upgrade for implementers would be best if this is fixed by spec.

Whether to pursue the "internal Console Standard hook" route vs something like a new web-exposed well-known Symbol name route depends on how important it is to have the pretty-printable values exposed to script. For most of these objects (perhaps except Headers?) Anne seemed to be leaning more towards the idea of a Console Standard hook so I guess we should figure that out. I don't know what would be best for these Fetch objects, or if Anne has strong opinions either way. That seems like the first thing to resolve though.

A Console-only hook would be pretty easy, but the implementation is vague and might be hard for vendors to prioritize, and you don't get any interesting values exposed to script. A web platform-exposed hook would give you more information, and could allow you to print things without requiring modifications to the implementation-specific Console logging, but it is a larger addition to the platform, especially with Symbols and all.
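
For a sense of what a script-exposed hook can look like, Node.js already has a runtime-specific one: util.inspect() and console.log() call a method stored under the registered symbol Symbol.for('nodejs.util.inspect.custom'). The sketch below uses it on a hypothetical wrapper class, InspectableHeaders; it is not a proposal for the Console Standard, and other runtimes and browsers simply ignore this symbol.

```javascript
// Node.js-specific: util.inspect() and console.log() look for a method under
// this registered symbol. It is not part of any web standard.
const customInspect = Symbol.for('nodejs.util.inspect.custom');

// Hypothetical wrapper; a standardized hook would live on the platform
// object itself rather than on a user-defined class.
class InspectableHeaders {
  constructor(init) {
    this.headers = new Headers(init);
  }

  // Returns a flat, human-readable summary of the wrapped headers.
  [customInspect]() {
    const entries = [...this.headers].map(([name, value]) => `${name}: ${value}`);
    return `Headers { ${entries.join(', ')} }`;
  }
}

const wrapped = new InspectableHeaders({ x: 1 });
console.log(wrapped);
```

The trade-off matches the one described above: the representation becomes available to script, at the cost of a new web-exposed name that every implementation would have to agree on.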

Thoughts @annevk ?

annevk commented 1 year ago

I think as far as console.log() goes we could perhaps stress more strongly that implementations are encouraged to provide a lot of additional context about platform objects being logged and maybe give some examples in the console.log() specification as to what that might look like.

However, this really only applies to the examples in OP where the object is directly passed and not stringified first or some such.

And on reflection for those cases what browsers show for Request and Response seems reasonable, but it could be tailored to be even better.

For Headers I can see making the case that it should show more of the internal structure.

I suspect we cannot actually make normative requirements about what the representation will look like so hooks of any kind are probably not needed here.