Accept headers in spec cause confusion

yoavweiss commented 8 years ago

During recent Chrome patch discussions, concerns were raised that Chrome's Accept headers are not spec compliant, as they include MIME types supported by Chrome but not supported universally. While that's not the case, it required explanations as to why our current implementation is as it should be. I think a note in the spec would go a long way to prevent similar cases in the future.

This was raised in the past, but was concluded as unnecessary. I agree with past arguments that current spec language permits UAs to extend the Accept header, but stating it explicitly, saying that user agents may (or even should) extend the spec's values with MIME types they support, would be helpful.

/cc @igrigorik

mnot commented 8 years ago

Relevant part of the spec is 5 Fetching.

See also this mozilla bug, which moved them to */* for images. It gives some rationale for how to manage these values.

annevk commented 8 years ago

Yeah, some of that bug should probably become a "Background reading" section. No objections from me to further clarifying this.

And I guess we should update image to be */* since that's now more common.

hober commented 6 years ago

Updating image to be */* would also solve the "what to do for <img src="foo.mp4">" problem.

cc @jernoble @grorg

domenic commented 6 years ago

To me the right framework seems to be mandating */* but then allowing browsers to add others on the end if they want? It seems like we shouldn't lock down the image formats the web supports, in general... Or should we try to converge on that too, these days?

annevk commented 6 years ago

Ideally they are added just like other features I think, but IPR complicates media formats tremendously. So maybe that's a reasonable way to go.

snuggs commented 6 years ago

To me the right framework seems to be mandating */* but then allowing browsers to add others on the end if they want? @domenic

Agreed! I do remember seeing someone on the Chromium bug state that "the spec shouldn't define Accept." Not certain I agree, but I do feel */* with the ability to extend is a win-win.

Very Postel's Law. Worked for TCP :-)

@annevk can you clarify "they"? Not sure of context.

mnot commented 6 years ago

FWIW - */* is always implied; the only reason you'd want to include it explicitly is if you want to say */*;q=0 -- i.e., "don't send me anything else."
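To make the q=0 point concrete, here is a minimal Python sketch of how a server might look up the quality value for a media type. The helper names parse_accept and quality_for are hypothetical, the parsing is simplified (quoted parameter values are not handled), and returning None for "no range matched" is an assumption chosen so the caller can decide what an unlisted type means.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)  # a malformed q raises ValueError in this sketch
        ranges.append((fields[0].lower(), q))
    return ranges


def quality_for(header, media_type):
    """Return q of the most specific matching range, or None if nothing matches."""
    media_type = media_type.lower()
    major = media_type.split("/")[0]
    ranges = parse_accept(header)
    # Most specific first: exact type, then type/*, then */*.
    for candidate in (media_type, major + "/*", "*/*"):
        for media_range, q in ranges:
            if media_range == candidate:
                return q
    return None


print(quality_for("text/css", "image/png"))          # no explicit match
print(quality_for("text/css,*/*;q=0", "image/png"))  # explicitly refused
```

With "text/css" alone nothing matches image/png, so the server is left to decide; with "text/css,*/*;q=0" the lookup yields a quality of zero, which is the "don't send me anything else" case.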

snuggs commented 6 years ago

Thanks for the clarification @mnot! 🙏

Also been meaning to ask for a while now but didn't know the place as the context is spread between WHATWG & W3C and all places in between.

Regarding preloading (and therefore fetch destinations), I am curious about the Accept value for the document destination, although I'm not versed in how fetch works with frame documents. I do realize there is a bug in Chrome where link as=document does not work. I noticed that as=script and as=fetch both use */*, whereas as=font and as=style send Accept values similar to the browser defaults. Would document send the default UA Accept: text/html,...,*/*;q=0.9 or */*? I think it should be the former of the two.

This has been a concern for me in keeping MDN docs up to date as am noticing documentation is already starting to send out (potentially) incorrect information and there are links from the documentation back to here for "a bit more detail".

preload has other advantages too. Using as to specify the type of content to be preloaded allows the browser to:

Prioritize resource loading more accurately.
Match future requests, reusing the same resource if appropriate.
Apply the correct content security policy to the resource.
Set the correct Accept request headers for it.

"correct" is the word that concerns me and who defines it.

Please forgive me if this is a vendor specific topic similar to as=document but I feel some vendor specific (and relative to myself, documentation) topics we can aid a touch with proactive avoidance AKA a touch more granular clarification within the spec.

Thanks in advance to whoever has information! I feel this is a relevant place to ask. Please direct me elsewhere if this is not the place. Wanted to spare yet another issue. This has been a mystery to me for months since the feature is broken at the vendor I use to test.

yoavweiss commented 6 years ago

@snuggs - <link rel=preload as=document> is not currently supported in Chromium, as supporting it in the current prioritization scheme is more likely to be a footgun than anything else. But please comment on the issue you linked to if you have feedback on that.

"correct" in the documentation most likely refers to the fact that the browser is aware of the Request.destination for the resource it is fetching and therefore can adapt the Accept value to that.

If you think some browsers are not sending the right Accept headers for preloads vs. regular fetches of the same request destinations, implementation bugs are probably in order.

snuggs commented 6 years ago

Hmmm @yoavweiss not certain I'm versed well enough on this topic to even give feedback. Still fairly new here. I do realize we push back on things conneg-y. However I feel much of the pain (and failure) in the past had much to do with conversations like this not being had during implementations (or being had behind respective vendor closed mail threads). Much has changed since then and specs (usually) are sound. The unclear assumptions made during implementation are what load the magazine of said footgun, IMHO. The WPT does help with these assumptions today but I could not find relevant tests there (I've checked).

Perhaps this is less of a deal than I think it is. You all would know better. However, during this transition to a more collaborative web I'm learning (from @domenic) that "that's just the way we've done things around here" is OK to question from time to time, as long as the question doesn't break the web of course. 😄 That's the part I'm still figuring out.

Is this even a concern @yoavweiss or are there better fish to fry?

Thanks for the feedback and your time.

yoavweiss commented 6 years ago

@snuggs - apologies if my reply came across as overly-negative. That wasn't my intention.

I do realize we push back on things conneg-y.

I don't know that this is still true in general. It certainly isn't true for my specific case. And we have been pushing improved content negotiation solutions in the last few years.

Is this even a concern @yoavweiss or are there better fish to fry?

I'm not sure what you're referring to by "this". IMO, "correct" and useful Accept headers are certainly something worth investing time and thought in. I don't think we need to specify what the value of those headers is for all request destinations, because support varies between different browsers and implementations. Therefore I believe the spec needs to give implementations the liberty (and guidance) to do what's right. Unlike today, where implementations need to ignore a SHOULD in order to truly advertise their support for non-universal file formats.

Regarding fetch vs. script destinations, do you have a use case in mind where differentiating the two would be helpful? What are the Accept header values for them when not considering preload?

snuggs commented 6 years ago

And we have been pushing improved content negotiation solutions in the last few years. @yoavweiss

I feel a ton better about my thoughts on the matter. Wish I would have seen that years ago.

Regarding fetch vs. script destinations, do you have a use case in mind where differentiating the two would be helpful? What are the Accept header values for them when not considering preload?

To be clear I have no issue with fetch & script sending */* all across the board. IIRC @domenic mentioned a needle in some haystack here about mime types only being needed to discern whether or not to block a script. I think related to image vs '*' and being able to be executed. As I think more there are a couple congruent thoughts.

What is the value of Accept on a per-case requested resource destination basis which I am much clearer on from this issue and the places you all are linking me to. (wasn't my initial concern but interested now).
Can/should the Accept be the same as the browser does now. (elaboration below).

Navigating to a url seems to give a consistent Accept Based on this MDN documentation. Would the correct term we use be "Browsing context"? I noticed that somewhere before. Also feel the term in the document "default" is really related to */* perhaps? Am understanding using the address bar/ <a href=...> (browsing context(?)) is the "default" for end users. IMHO better stated as "most commonly used by humans but browsers aren't humans.". Also states

Browsers add the */*. Going off our discussion the inverse would be true. The spec defaults the browser to */* and each can add at their discretion. Would personally love to have document request destination send more than */*.

My use case is that I am currently preloading some HTML. Because as=document does not work (it seems more semantically correct than fetch, and months ago I didn't know this was a problem specific to Chrome), we began using <link rel=preload as=fetch> to work around the bug. We do some light logging internally via Accept, although technically an HTML file and a JavaScript file are both "text" (as @annevk pointed out to me earlier today). I do feel that if there is a chance to have as=document (or the document destination) send the "browsing context" Accept, or whatever that default is called, we should take it. I think this is the place to at least start the conversation. I do not know how this affects XMLHttpRequest, though.

P.S. note to self Use this, that, it, etc. less on Github. This (pun intended) is the second time someone mentioned to me. :-)

snuggs commented 6 years ago

@yoavweiss I think I found what/where i've been looking for:

https://fetch.spec.whatwg.org/#fetching

Step 3.3

... Otherwise, a user agent should set value to the first matching statement, if any, switching on request’s destination:

"image" image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5
"style" text/css,*/*;q=0.1

Is this list not exhaustive? Should the spec include document etc? Is this where we run into issues when Firefox changes image to */*?

yoavweiss commented 6 years ago

document is covered by the previous clause:

If request is a navigation request, a user agent should set value to text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8.

But otherwise, the list is not exhaustive. This issue is about making sure that this list enables user agents to maintain useful Accept header values that match their supported formats, while staying spec compliant.
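As a rough illustration of the two clauses quoted above, the following Python sketch selects a default Accept value from a request's destination. The three header strings are the ones quoted in this thread; the function name default_accept, the is_navigation flag, and the ua_override parameter (standing in for a user agent that exercises the "should" and substitutes its own value) are hypothetical and not part of the spec text.

```python
# Values quoted in this thread from the Fetch spec text at the time.
NAVIGATION_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
DESTINATION_ACCEPT = {
    "image": "image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5",
    "style": "text/css,*/*;q=0.1",
}


def default_accept(destination, is_navigation=False, ua_override=None):
    """Pick an Accept value for a request (illustrative only).

    ua_override is a hypothetical hook: the quoted text says "should",
    so a user agent may send a value that reflects its own format support.
    """
    if ua_override is not None:
        return ua_override
    if is_navigation:
        return NAVIGATION_ACCEPT
    # Destinations not in the (non-exhaustive) list fall back to */*,
    # which matches what the thread reports for script and fetch.
    return DESTINATION_ACCEPT.get(destination, "*/*")


print(default_accept("style"))
print(default_accept("script"))
print(default_accept("document", is_navigation=True))
```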

snuggs commented 6 years ago

@yoavweiss Just an update on a question I asked you earlier about the document destination in Chrome. I don't know if "Main Frame" means browsing context, but I do see where the document destination would be set, which was addressed in this patch. Would be able to test if as=document worked for Chrome. But alas... Thanks for the input, and I apologize if I diverted the issue. I believed the topic was relevant from searching issues. And I know now where to track these topics and bugs.

Thanks for your help.

annevk commented 5 years ago

So I looked at this again, for the img element:

Firefox: image/webp,*/*
Safari: image/png,image/svg+xml,image/*;q=0.8,video/*;q=0.8,*/*;q=0.5
Chrome: image/webp,image/apng,image/*,*/*;q=0.8

And Fetch still has image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5, matching nobody, though it's a "should", not a "must".

I guess we can say something about using */* when in doubt and something matching the Accept production otherwise, unless there's a desire from user agents to align here.
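To show what these differing values mean for a server, here is a small Python sketch that picks between a WebP and a PNG variant for each of the three img values listed above. The names parse_accept, quality_for, and pick_variant are hypothetical, the parser is simplified, and breaking ties by the server's own preference order is an assumption rather than required behaviour.

```python
BROWSERS = {
    "Firefox": "image/webp,*/*",
    "Safari": "image/png,image/svg+xml,image/*;q=0.8,video/*;q=0.8,*/*;q=0.5",
    "Chrome": "image/webp,image/apng,image/*,*/*;q=0.8",
}


def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        ranges.append((fields[0].lower(), q))
    return ranges


def quality_for(ranges, media_type):
    """q of the most specific matching range; 0.0 if nothing matches (assumption)."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        for media_range, q in ranges:
            if media_range == candidate:
                return q
    return 0.0


def pick_variant(header, variants=("image/webp", "image/png")):
    """Choose the variant with the highest q; earlier variants win ties."""
    ranges = parse_accept(header)
    # max() returns the first maximal item, so server order breaks ties.
    return max(variants, key=lambda v: quality_for(ranges, v))


for name, header in BROWSERS.items():
    print(name, "->", pick_variant(header))
```

Under these assumptions the Firefox and Chrome values select the WebP variant, while the Safari value selects PNG, because WebP only matches its image/*;q=0.8 range.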

yoavweiss commented 5 years ago

As long as different user agents support different file formats it doesn't make sense to align here. For example, I think that nowadays Firefox and Chrome could align with each other, but they shouldn't align with Safari as long as Safari doesn't support webp and does support videos in image contexts.

What Fetch could do is provide general guidance about how UAs should construct their Accept strings (e.g. have the "rarest" supported format first, collapse commonly supported values into "*/*", etc), but I don't think the spec should dictate any specific values.

That is, unless we want to define precise processing ("if webp is supported, apng is supported, and video formats in image contexts are not supported, return ...." kind of definition).

Does that make sense?
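One possible reading of that guidance, sketched in Python: list the formats a user agent supports that are not universally supported, rarest first, and collapse everything else into */*. The function build_accept and the notion of a "universal" set are hypothetical; which formats count as universal is exactly the kind of decision the spec or the implementations would have to make.

```python
def build_accept(supported, universal):
    """Build an Accept value from supported formats (illustrative only).

    supported: media types the user agent can decode, ordered rarest first.
    universal: media types assumed to be supported by every user agent;
               these are collapsed into the trailing */*.
    """
    rare = [fmt for fmt in supported if fmt not in universal]
    return ",".join(rare + ["*/*"])


UNIVERSAL = {"image/png", "image/gif", "image/jpeg"}  # assumption for this sketch

print(build_accept(["image/webp", "image/png", "image/gif"], UNIVERSAL))
print(build_accept(["image/png", "image/jpeg"], UNIVERSAL))
```

With WebP as the only non-universal format this produces the same shape as the Firefox value listed earlier in the thread (image/webp,*/*), and a user agent with no extra formats would send plain */*.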

annevk commented 5 years ago

I'm not sure, Chrome expanding the Accept header for navigations creates a mess such as https://bugzilla.mozilla.org/show_bug.cgi?id=1544231. And ideally addition of formats goes through a similar process as adding new features, at which point a decision about the new header could be made as well.

wolfbeast commented 5 years ago

From #877

This should be adapted to give user agents the possibility to include image formats that might not be universally accepted but preferred, e.g. image/webp or image/apng or image/jxr, and to exclude formats that might not be supported or preferred (e.g. excluding svg support for agents on systems where vector graphics are disproportionately expensive to process), to allow server-driven conneg to happen with servers that support multiple other formats.

As it stands, the spec doesn't allow for this kind of required flexibility, since it creates the expectation that all user agents should align, and user agents would not be spec compliant if they indicate their capabilities to servers, which is desired in many situations. What's the point of the Accept headers if you're not actually allowed to use them for what they are for?

yoavweiss commented 5 years ago

I'm not sure, Chrome expanding the Accept header for navigations creates a mess such as https://bugzilla.mozilla.org/show_bug.cgi?id=1544231.

Speaking only for myself (haven't asked around yet), I think it may make sense to define what the Accept header values are given a set of supported formats. The main risk is that the definition may get verbose/complex as the number of permutations gets higher.

At the same time, I wouldn't want Safari to "align" their Accept values before they actually support the underlying formats.

P.S. On the cited issue, I'd argue that Firefox should add webp support to its navigation Accept values, to enable negotiation of image documents.

annevk commented 5 years ago

Negotiation of image documents is not what it is being used for. The issues Firefox hits are because sites are using the Accept header to figure out what HTML to generate, which is not what the feature is for.

wolfbeast commented 5 years ago

That's not a problem with the spec. Mandating certain fixed strings in the spec to passive-aggressively force website owners to use other methods for browser detection (which is what you're saying, IIUC) is the wrong approach to this problem, especially if you're breaking valid content negotiation mechanisms in the process.

whatwg / fetch

Accept headers in spec cause confusion #274
