Closed mnot closed 4 years ago
fielding@gbiv.com commented:
From 2113:
(editorial) make section on language tags more concise, since we already delegate the definition to RFC5646; partly addresses #426
fielding@gbiv.com commented:
From 2114:
(editorial) improve description of 300 and 406 in reactive negotiation; partly addresses #426
fielding@gbiv.com commented:
From 2115:
(editorial) product tokens listed in decreasing order; partly addresses #426
fielding@gbiv.com commented:
3.1. Representation Metadata
Expires | Section 7.3 of [Part6] |
If "Expires" is considered "representation metadata", then it seems like "ETag" and "Last-Modified" should be as well. But I think it would make more sense to just remove "Expires" from the list; it's clearly the odd man out here.
Moved to control data in 2092.
3.1.1.2. Character Encodings (charset)
Implementers need to be aware of IETF character set requirements [RFC3629] [RFC2277].
It's not clear what requirements this is referring to; RFC 2277 places requirements on protocol authors, not on implementors, and RFC 3629 is just the definition of UTF-8. If the requirement is "implementations MUST support UTF-8" then we should say that.
Removed in 1975.
3.1.1.4. Multipart Types
In general, HTTP treats a multipart message body no differently than any other media type: strictly as payload. HTTP does not use the multipart boundary as an indicator of message body length. In all other respects, an HTTP user agent SHOULD follow the same or similar behavior as a MIME user agent would upon receipt of a multipart type.
That last part seems completely wrong; a web browser is not expected to handle multipart/alternative or multipart/related in the way a mail reader would. (This requirement came from RFC 2616, but... it was wrong then too.)
It was right back in the days of Mosaic for X. It isn't implemented by browsers today. Removed in 2050.
The MIME header fields within each body-part of a multipart message body do not have any significance to HTTP beyond that defined by their MIME semantics.
This is not true of multipart/byteranges; in RFC 2616 that was explained separately, but that explanation got lost in httpbis rewrites at some point.
Suggested rewrite for the second and third paragraphs:
In general, HTTP treats a multipart message body no differently than any other media type: strictly as payload. The one exception is the "multipart/byteranges" type (Appendix A of [Part5]) when it appears in a 206 (Partial Content) response. In all other cases, the MIME header fields within each body-part of a multipart message body do not have any significance at the HTTP level; they are just part of the representation data.
(This drops the newly-added "HTTP does not use the multipart boundary as an indicator of message body length", but that is already implied by the removal of 2616's prohibition on epilogue data; if the multipart is allowed to have an epilogue, then the final boundary doesn't indicate the end of the body anyway. It also drops the "unrecognized multipart subtype" text, which was already irrelevant given the "strictly as payload" rule anyway.)
A similar rewrite was done in 2050.
3.1.3.1. Language Tags
In summary, a language tag is composed of one or more parts: A primary language subtag followed by a possibly empty series of subtags:
language-tag = <Language-Tag, defined in [RFC5646], Section 2.1>
Kinda weird... the text sets you up to expect an actual grammar for language-tag, but then you just get a cross-reference. I'd rearrange stuff to:
... HTTP uses language tags within the Accept-Language and Content-Language fields.
language-tag = <Language-Tag, defined in [RFC5646], Section 2.1>
A language tag is composed of one or more parts: A primary language subtag followed by a possibly empty series of subtags. White space is not allowed within the tag and all tags are case-insensitive. Example tags include:
en, en-US, es-419, az-Arab, x-pig-latin, man-Nkoo-GN
See [RFC5646] for further information.
(also dropping the language-subtag-registry ref, since that's covered by the "See [RFC5646]")
Done in 2113.
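As an illustration of the two rules stated in the suggested text (no white space inside a tag, and case-insensitive comparison), here is a minimal Python sketch. The function names are hypothetical, and it does not validate tags against the full RFC 5646 grammar; it only splits on hyphens and compares in lowercase.

```python
def normalize_language_tag(tag):
    """Lowercase a language tag for comparison; reject embedded white space."""
    if not tag or any(ch.isspace() for ch in tag):
        raise ValueError("white space is not allowed in a language tag: %r" % tag)
    return tag.lower()


def same_language_tag(a, b):
    """Compare two tags case-insensitively."""
    return normalize_language_tag(a) == normalize_language_tag(b)


def split_subtags(tag):
    """Return (first subtag, list of remaining subtags), all lowercased."""
    first, *rest = normalize_language_tag(tag).split("-")
    return first, rest
```

For example, `same_language_tag("en-US", "EN-us")` is true, and `split_subtags("man-Nkoo-GN")` yields the first subtag plus two further subtags.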
3.4. Content Negotiation
(such as when many different formats are supported by a user-agent),
no hyphen
Fixed already (and then rewritten later in 2050).
3.4.1. Proactive Negotiation
If the selection of the best representation for a response is made by an algorithm located at the server, it is called proactive negotiation.
That text doesn't motivate the new name. How about:
If the selection of the best representation for a response is made by the server based on preferences indicated by the user agent in its initial request for the resource, it is called proactive negotiation.
Rewritten in 2050.
- It might limit a public cache's ability to use the same response for multiple user's requests.
users' not user's
Rewritten in 2050.
For example, the origin server might not implement proactive negotiation, or it might decide that sending a response that doesn't conform to them is better than sending a 406 (Not Acceptable) response.
Not clear what "them" is. "...that doesn't conform to the user agent's preferences..."
Done in 2050.
3.4.2. Reactive Negotiation
This specification defines the 300 (Multiple Choices) and 406 (Not Acceptable) status codes for enabling reactive negotiation when the server is unwilling or unable to provide a varying response using proactive negotiation.
406 doesn't really "enable reactive negotiation". It just fails to do proactive negotiation.
Fixed in 2114.
Also, should we mention how reactive negotiation is actually done?
This specification defines the 300 (Multiple Choices) status code for enabling reactive negotiation. However, in practice, Web sites wanting to do reactive negotiation will just return a successful response containing a "default" (or proactively negotiated) representation of the resource, which includes within it links that the user can follow to reach other representations.
I have mentioned other patterns in the parent section and within the 300 code.
Product Tokens
By convention, the products are listed in order of their significance for identifying the application.
"...in decreasing order of...", or something like that. (likewise in the description of User-Agent in 6.5.3 and Server in 8.4.2)
Fixed in 2115.
... more later ...
fielding@gbiv.com commented:
From 2116:
rewrite the sections on retrying requests and pipelining to resolve nonsense about non-idempotent sequences; partly addresses #426
fielding@gbiv.com commented:
From 2118:
reorder paragraphs in method descriptions for consistency; note that CONNECT is not cacheable; partly addresses #426
fielding@gbiv.com commented:
From 2119:
Accept-Language: clean up prose and note descending order of priority for equal weights (as defined in RFC4647 and original HTTP); partly addresses #426
fielding@gbiv.com commented:
From 2120:
(editorial) add section intros; partly addresses #426
fielding@gbiv.com commented:
5.2.2. Idempotent Methods
Section 6.2.2.1 of Part1 implies that the concept of "idempotent sequences of request methods" (as opposed to merely "idempotent methods") will be discussed here, but it's not. I'm not sure if it should be added here or there.
Rewritten there in p1 2116.
5.3.1. GET
The semantics of the GET method change to a "partial GET" if the request message includes a Range header field ([Part5]).
"a Range or If-Range header field"
No, If-Range has no meaning without Range.
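The point of the reply can be sketched in a few lines of Python: only the presence of Range turns a GET into a partial GET, and If-Range on its own changes nothing. The function name and the plain-dict header representation are assumptions made for illustration.

```python
def is_partial_get(method, headers):
    """True when the request is a GET that carries a Range header field.

    `headers` is assumed to be a dict of field name -> field value.
    If-Range is deliberately not consulted: without Range it has no meaning.
    """
    names = {name.lower() for name in headers}  # field names are case-insensitive
    return method == "GET" and "range" in names
```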
5.3.6. CONNECT
Though obvious, it seems like for consistency's sake, this should end with:
Responses to the CONNECT method are not cacheable.
sigh 2118.
5.3.7. OPTIONS
If no payload body is included, the response MUST include a Content-Length field with a field-value of "0".
Does this actually mean to prohibit servers from using chunked encoding (or "Connection: close" with no Content-Length) in that case? Or is it just supposed to be a reminder that "empty message body" is different from "no message body"?
(Section 9.1.2 has basically the same text.)
Yes, they were designed to require a specific indicator of no body for the sake of persistent connections.
If no Max-Forwards field is present in the request, then the forwarded request MUST NOT include a Max-Forwards field.
"If no Max-Forwards field is present in the upstream request, then the downstream request MUST NOT include a Max-Forwards field."
Already rephrased in 2064.
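A hedged sketch of how an intermediary might apply this rule when forwarding a request. The function name is hypothetical. The quoted sentence only covers the "absent" case; the decrement and the refusal to forward at zero reflect my understanding of how Max-Forwards is defined for TRACE and OPTIONS, so treat those branches as assumptions.

```python
from typing import Optional


def downstream_max_forwards(upstream_value: Optional[str]) -> Optional[str]:
    """Return the Max-Forwards value to send downstream, or None to omit it."""
    if upstream_value is None:
        # Absent upstream: the downstream request must not gain the field.
        return None
    remaining = int(upstream_value)
    if remaining <= 0:
        # Assumption: at zero the recipient answers itself instead of forwarding.
        raise ValueError("Max-Forwards is 0: respond as the final recipient")
    return str(remaining - 1)
```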
6.2. Conditionals
The HTTP/1.1 conditional request mechanisms are defined in [Part4].
"and [Part5]" (If-Range)
That is noted in Part4.
6.3. Content Negotiation
6.1 and 6.2 had some introductory text before the table, and it seems weird to not have that here.
(6.4 and 6.5 have the same problem)
Fixed in prior edits and 2020.
6.3.1. Quality Values
Should this section be called "Weight" now?
I don't think so, mostly for historical reasons.
6.3.5. Accept-Language
would mean: "I prefer Danish, but will accept British English and other types of English". (see also Section 2.3 of [RFC4647])
Capitalize "See"
Led to a larger rewrite in 2119.
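To make the ordering concrete, here is a small Python sketch that parses an Accept-Language value and sorts the language ranges by weight, keeping the listed order for equal weights (the behaviour noted in the 2119 commit message). The function name is hypothetical, and the parser is deliberately loose: it does not validate q-values or language ranges.

```python
def parse_accept_language(value):
    """Return [(language_range, weight), ...] in descending order of preference."""
    ranges = []
    for position, item in enumerate(value.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        weight = 1.0  # a missing q parameter means full weight
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            if name.strip().lower() == "q" and raw:
                weight = float(raw)
        ranges.append((tag.strip().lower(), weight, position))
    # Higher weight first; ties keep the order in which they were listed.
    ranges.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(tag, weight) for tag, weight, _ in ranges]
```

With the header value `da, en-gb;q=0.8, en;q=0.7`, which matches the "I prefer Danish" reading quoted above, the result lists `da` first, then `en-gb`, then `en`.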
... more later ...
fielding@gbiv.com commented:
From 2122:
(editorial) explain empty Allow field for 405; misc typos; partly addresses #426
fielding@gbiv.com commented:
Response Status Codes
The status-code element is a 3-digit integer result code of the attempt to understand and satisfy the request.
"...a 3-digit integer code giving the result of the attempt..."
o 2xx (Successful): The action was successfully received, understood, and accepted
"The request was successfully..."
Both fixed by Julian 1964.
7.1. Overview of Status Codes
The reason phrases listed here are only recommendations -- they can be replaced by local equivalents without affecting the protocol.
That suggests you can/should translate them into other languages, which isn't really what they're for and kind of contradicts p1 3.1.2's "A client SHOULD ignore the reason-phrase content."
They can be (and often are) localized in practice. The client SHOULD ignore them, yes, but that doesn't mean servers don't have to respect local requirements regarding their own language use.
| 415 | Unsupported Media Type | Section 7.5.13 |
| 416 | Requested range not satisfiable | Section 3.2 of [Part5] |
| 417 | Expectation Failed | Section 7.5.14 |
The capitalization of "Requested range not satisfiable" is inconsistent with the rest of the table.
Fixed by Julian 1964. I've shortened it to Range Not Satisfiable.
7.2. Informational 1xx
A client MUST be prepared to accept one or more 1xx status responses prior to a regular response, even if the client does not expect a 100 (Continue) status message.
No reason to call out 100 Continue specifically here... "A client MUST be prepared to accept one or more 1xx status responses prior to a regular response, even if the client does not expect one."
Yep, fixed in 2122.
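A minimal Python sketch of what "be prepared to accept one or more 1xx responses" means for a client: interim responses are skipped until a final one arrives, whether or not the client asked for them. The function name and the (status, body) tuple representation are assumptions; a real client would also need to treat 101 (Switching Protocols) specially, which this sketch does not do.

```python
def final_response(responses):
    """Return the first non-1xx (status, body) pair from an iterable of responses."""
    for status, body in responses:
        if 100 <= status < 200:
            # Interim response: tolerate it even if it was not expected.
            continue
        return status, body
    raise ValueError("the exchange ended before a final response arrived")
```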
7.3.2. 201 Created
If the newly created resource's URI is the same as the Effective Request URI, this information can be omitted
"effective request URI" is not capitalized like that anywhere else. (Well, except for once more later on in this section which should also be fixed.)
Fixed in 2105.
If the action cannot be carried out immediately, the server SHOULD respond with 202 (Accepted) response instead.
"with a 202 (Accepted) response"
Fixed by Julian 1964.
8.1.1.2. Date
- If the response status code is 100 (Continue) or 101 (Switching Protocols), the response MAY include a Date header field, at the server's option.
Is that really supposed to be limited to 100 and 101, and not other 1xx codes?
No, already rewritten to fix that.
8.1.3. Retry-After
This field MAY also be used with any 3xx (Redirection) response to indicate the minimum time the user-agent is asked to wait
No hyphen in "user agent"
Fixed by Julian.
8.4.1. Allow
Allow = #method
Should that be 1#method? If not, it should explain what an empty "Allow" header means.
Yes, explained in 2122.
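Because the grammar is `#method` rather than `1#method`, a parser has to accept an empty field value. A rough Python sketch follows, with a hypothetical function name; it assumes the empty list is read as "the resource currently allows no methods", which is how I understand the explanation added for 405 responses.

```python
def parse_allow(value):
    """Split an Allow field value into a list of method names.

    An empty (or all-white-space) value yields an empty list.
    """
    return [method.strip() for method in value.split(",") if method.strip()]
```

For example, `parse_allow("GET, HEAD, PUT")` gives three methods, while `parse_allow("")` gives an empty list.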
9.1.1. Procedure
HTTP method registrations MUST include the following fields:
Should "cacheability" be an explicit field (rather than just a required part of the specification text)?
We discussed this in another issue and decided that it was too complex an issue for a simple checkmark.
9.3. Header Field Registry
It seems weird to have this in p2 since p1 defines headers too...
A registry is primarily for linking from name to semantics.
9.3.1. Considerations for New Header Fields
o Whether it is appropriate to list the field-name in the Connection header field (i.e., if the header field is to be hop-by-hop, see Section 6.1 of [Part1]).
should have a semicolon rather than comma after "hop-by-hop". (So that it doesn't read like it's telling you to only follow the xref if the header field is hop-by-hop.)
Fixed by Julian 1964.
10.1. Transfer of Sensitive Information
Four header fields are worth special mention in this context: Server, Via, Referer and From.
"Via" is in p1 though, so the Via bits should be moved to p1's Security Considerations? (Or maybe if we end up with a p0, all of the security considerations should be consolidated there.)
I think it belongs here.
The information sent in the From field might conflict with the user's privacy interests or their site's security policy, and hence it SHOULD NOT be transmitted without the user being able to disable, enable, and modify the contents of the field. The user MUST be able to set the contents of this field within a user preference or application defaults configuration.
Do any browsers actually ever send the "From" header? If not, should we just say "From is for robots, not browsers"?
I rewrote this in 2054.
Appendix C. Changes from RFC 2616
Remove base URI setting semantics for "Content-Location" due to poor implementation support, which was caused by too many broken servers emitting bogus Content-Location header fields, and also the potentially undesirable effect of potentially breaking relative links in content-negotiated resources. (Section 3.1.4.2)
That would parse better if the "which was..." clause was parenthesized rather than just set off by commas.
Fixed by Julian and then rewritten again by me in 2083.
Failed to consider that there are many other request methods that are safe to automatically redirect, and further that the user agent is able to make that determination based on the request method semantics.
This is written in the opposite style from the rest of the list (it describes the problem with 2616 rather than the solution in httpbis). Should be something like:
Allow automatic redirection of all "safe" methods, not just GET and HEAD, and give the user agent more latitude in redirecting unsafe methods. (Section 7.4)
Rewritten in 2083.
Thanks for your detailed comments; all have been addressed or explained above.
incorporated; status changed from new to closed
fielding@gbiv.com changed milestone from unassigned to 22
@mnot changed summary from p2 editorial feedback to p2 editorial feedback 2
Reported by @mnot, migrated from https://trac.ietf.org/trac/httpbis/ticket/426