WebAssembly / wasi-http

Other
152 stars 24 forks source link

Fields/Headers take a list of Vec<(String, String)> to create, but probably should take Vec<(String, List<String>)> #12

Open brendandburns opened 1 year ago

brendandburns commented 1 year ago

The new-entries function takes a list of (String, String) but the fields-get function returns a Vec(String)

Headers can have multiple values per the HTTP spec, so I think that new-entries should probably take Vec<(String, List<String>)>

lukewagner commented 1 year ago

I wondered about this as well. One reason why perhaps not is: in the underlying protocols, iiuc, the fields are just a sequence of key/value pairs and thus if we cluster them by key, we're losing the protocol order which I could hypothetically imagine affecting some applications in corner cases. Also, for at least some common HTTP libraries (1,2), list-of-string-pairs is offered as the lower-level ground truth. Also, when you want to perform a higher-level mapping of a key to 0..N values, then the method you want is fields-get (allowing the host impl to use a backing hash-table (e.g.)). Thus, I think list<tuple<string,list<string>>> sits in a sort of no-man's land between the lower-level sequence (that I think is commonly used in the representation) and the higher-level map.

But that's just my understanding based on a partial knowledge of this space, so I'm happy to hear other thoughts too.

pimterry commented 1 year ago

the fields are just a sequence of key/value pairs and thus if we cluster them by key, we're losing the protocol order which I could hypothetically imagine affecting some applications in corner cases

This is notable because quite a few services (notably Cloudflare) use the details of this header ordering & header key case (Connection vs connection) for HTTP fingerprinting to detect types of client, and in many configurations they'll automatically block HTTP traffic based on that alone as a bot detection heuristic. Affects HTTP proxies, web crawlers, scraping tools, uncommon web browser implementations, etc etc - big chunks of the web become inaccessible via HTTP if you can't precisely control raw header order & casing.

Not something that most normal HTTP API clients need to care about much, but imo it's definitely worth having low-level tuple<string, string>[] APIs available everywhere to support these cases (for example Node.js has a headers map and rawHeaders array available on all HTTP messages, and writeHead(status, headers) accepts headers in both KV map & string array formats).

brendandburns commented 1 year ago

@lukewagner my biggest feedback is that there is inconsistency between the way that you create a new headers object (string, string) and the value that is returned when you call get (vec). And then of course there's also an 'append' method.

I don't feel strongly, multi-header is definitely an edge case, but we should be consistent at least.

lukewagner commented 1 year ago

Sorry, I think I'm missing your point. As I'm reading it, new-fields takes a list<tuple<string,string>> and fields-entries returns the same type, so we're symmetric in the case of bulk construction/consumption. The other methods are for finer-grained queries/mutations that don't require touching the complete list of fields.

pchickey commented 1 year ago

I believe his argument is that new-fields should take a list<tuple<string,list<string>>>, a design change I am also in favor of

lukewagner commented 1 year ago

Wouldn't that also lead to the same loss in control over ordering as if fields-entries returned a list<tuple<string,list<string>>>? There's also a related question of whether the caller of new-fields would be allowed to pass the same key multiple times in the outer list, and whether that's enforced and, if not, what happens (including in chaining scenarios). (As a side note: it's unfortunate we don't have a map type, which is complicated for a lot of the same reasons.)

brendandburns commented 1 year ago

I took a random look at a common HTTP library (https://square.github.io/okhttp/4.x/okhttp/okhttp3/-headers/) and it looks like the way they solved this is that they have both a field-entry (which returns the first value) field-entries (which returns the list of values) as a way of making the API more explicit.

lukewagner commented 1 year ago

Starting with what we have proposed now:

I'm guessing you're not asking for a variation on the first one that just returns the first value (hard to imagine that one being useful), but, rather, a variation on the second that just returns the first value for a given name (or none), so perhaps that suggests a name like:

PiotrSikora commented 1 year ago

I took a random look at a common HTTP library (https://square.github.io/okhttp/4.x/okhttp/okhttp3/-headers/) and it looks like the way they solved this is that they have both a field-entry (which returns the first value) field-entries (which returns the list of values) as a way of making the API more explicit.

Usually, such API is a result of fixing the issue of returning only the first value, while retaining backward compatibility at the API level, i.e. it's not a proper solution for a new API.

brendandburns commented 1 year ago

I think that the fundamental issue is that most people's model of HTTP headers is a hash-map. This isn't correct, but it is the general mental model.

If we're going to push people towards the "list of tuples" mental model, then we need to be consistent, so I would prefer

I think the challenge with fields-get: func(fields: fields, name: string) -> list<string> paints a picture where it really looks like the structure is a hashmap.

Additionally given that the common (by far) case is a single (string, string) tuple for each header name, returning a list makes the programmer do extra work that is dissonant.

brendandburns commented 1 year ago

fwiw golang also separates Header::Get (get the first value) from Header::Values (get all values) https://pkg.go.dev/net/http#Header

On the other hand, dotnet only returns the list: https://learn.microsoft.com/en-us/dotnet/api/system.net.http.headers.httpheaders.getvalues?view=net-8.0#system-net-http-headers-httpheaders-getvalues(system-string)

So maybe it doesn't matter :)

dicej commented 1 year ago

Seems to me that new-fields should mirror fields-entries and accept a list<tuple<string,list<u8>>>.

lukewagner commented 1 year ago

Ah yes, I think this was an omission in #29. I'll file a follow-up fix.

lukewagner commented 1 year ago

42