httpwg / http-extensions

HTTP Extensions in progress
https://httpwg.org/http-extensions/

query: format for Accept-Query field #2934

Open reschke opened 3 weeks ago

reschke commented 3 weeks ago

1) #mediatype (for consistency, for example, with PATCH: https://greenbytes.de/tech/webdav/rfc5789.html#accept-patch)
2) Structured Fields, because that's how we define new fields (and one could argue that we shouldn't make up reasons not to use that format)

My proposal is 2).

Acconut commented 3 weeks ago

I'm also in favor of 2).

darrelmiller commented 3 weeks ago

I am in favour of 2)

reschke commented 3 weeks ago

We'll use SF in the next draft, and see whether somebody complains...

zenomt commented 3 weeks ago

sorry for not chiming in sooner. i'm in favor of (1) #media-range (not #media-type, to allow for wildcarding), both for consistency with Accept, Accept-Patch, and Accept-Post, as well as that if it's a structured field with a list of tokens (as proposed by Martin), then a media type and its parameters is essentially an opaque string, and

application/something; param1="foo"; param2="bar"

would be distinct from

application/something; param2="bar"; param1="foo"

even though they are semantically the same, and still require parsing a media-range out of the SF value.
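To illustrate the parameter-order point, here is a minimal Python sketch (not from the thread). The helper `parse_media_range` is hypothetical and deliberately simplified: it assumes quoted parameter values contain no `;` or `"` characters.

```python
def parse_media_range(value: str) -> tuple[str, dict[str, str]]:
    """Split 'type/subtype; k="v"; ...' into (type/subtype, {k: v}).

    Hypothetical helper, simplified: assumes no ';' or '"' inside
    quoted parameter values.
    """
    head, *params = [part.strip() for part in value.split(";")]
    parsed = {}
    for param in params:
        name, _, raw = param.partition("=")
        # Parameter names are case-insensitive; strip the surrounding quotes.
        parsed[name.strip().lower()] = raw.strip().strip('"')
    return head.lower(), parsed


a = 'application/something; param1="foo"; param2="bar"'
b = 'application/something; param2="bar"; param1="foo"'

# Compared as opaque strings, the two values differ.
opaque_equal = a == b
# Compared after parsing, parameter order no longer matters.
parsed_equal = parse_media_range(a) == parse_media_range(b)

print(opaque_equal, parsed_equal)  # False True
```

A receiver that treats each value as an opaque string sees two different entries; only after the extra media-range parsing step do they compare equal.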

reschke commented 3 weeks ago

Good point. @martinthomson - why exactly would you want the parameters to be part of the string item? That would make parsing harder than it has to be.

(I assume there was a good reason for your proposal, but which?)

zenomt commented 3 weeks ago

notwithstanding the issue with parameter order i just mentioned, if this is a structured field, then you need to do the SF parsing to get tokens, and then parse each one as a media-range anyway. so why not just media-ranges without an extra SF wrapper?

reschke commented 3 weeks ago

One reason we (for some value of "we") prefer SF is that we've been consistently telling others not to define new fields in a non-SF syntax - even for consistency with other existing fields.

zenomt commented 3 weeks ago

my reading of RFC 9651 suggests that a SF that is a List of Token Items (each of which can be parameterized) is syntactically equivalent (and mostly semantically equivalent) to #media-range (from RFC 9110). that is

Accept-Query: application/something; param1="foo", application/other; profile="bar"; param="baz"

conforms to both #media-range from RFC 9110, and sf-list from RFC 9651 where each sf-item is an sf-token followed by its parameters.

martinthomson commented 2 weeks ago

The reason that I suggested parameters as part of the string/token is that I'm not 100% confident that media type parameters fit the SF parameter syntax. If you do the analysis and conclude that it is OK, then that might be OK. Finally, I see q=0.9 being a field value extension, whereas charset="utf8" is a media type extension, so maybe having distinct means of extension makes that separation possible.

zenomt commented 2 weeks ago

i think there is a complication in the syntax: the SF ABNF only allows parameters to be separated from each other by SP and with no SP before a semicolon, whereas media-range allows OWS (SP and HTAB) to separate parameters and to occur before semicolons too.
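The whitespace difference can be checked mechanically. The following Python sketch (not from the thread) flags values whose parameter separators would be acceptable as OWS in a media-range but not in a Structured Field. The function name `uses_only_sf_separators` is hypothetical, and the check assumes quoted parameter values contain no `;`.

```python
import re

# Matches a separator that media-range OWS permits but SF does not:
# any SP/HTAB before ";", or an HTAB anywhere in the run after ";".
NON_SF_SEPARATOR = re.compile(r"[ \t]+;|;[ \t]*\t")


def uses_only_sf_separators(value: str) -> bool:
    """True if every ';' has no whitespace before it and only SP after it.

    Simplified: assumes quoted parameter values contain no ';'.
    """
    return NON_SF_SEPARATOR.search(value) is None


ok_sp_after = uses_only_sf_separators('application/something; param1="foo"')
ok_no_space = uses_only_sf_separators('application/something;param1="foo"')
bad_sp_before = uses_only_sf_separators('application/something ;param1="foo"')
bad_htab_after = uses_only_sf_separators('application/something;\tparam1="foo"')

print(ok_sp_after, ok_no_space, bad_sp_before, bad_htab_after)
# True True False False
```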

zenomt commented 2 weeks ago

the SF Token Item production is compatible with media-range though. i wouldn't object to the definition of the field being that it encodes a "media range and its parameters" as an sf-list of SF Token Items and their parameters, rather than requiring the field itself to be #media-range. but i do think it's objectionable to have a media-range inside an sf-string, since that still requires separate parsing of each string value as a media-range, and (at the SF layer) destroys the semantic distinction of the parameters.

reschke commented 2 weeks ago

@martinthomson - we do have that separation for similar fields. OTOH, putting this into a string - essentially a micro syntax - is asking for trouble.

@zenomt - the goal is not to use an existing non-SF parser.

@mnot - I see that "sfbis" discourages using tokens in new field definitions: "Note that Tokens are defined largely for compatibility with the data model of existing HTTP fields and may require additional steps to use in some implementations." - what does this mean (assuming the implementation has a conforming SF parser)? And does it apply here? AFAICT, we explicitly defined tokens so that this kind of string could be used.

zenomt commented 2 weeks ago

@reschke wrote:

"@zenomt - the goal is not to use an existing non-SF parser."

by "[the] production is compatible with media-range though" i only meant that, if limiting oneself to the constraints of SF (no HTABs, no spaces before semicolons), the item and its parameters would look like (and be compatible with) ordinary media ranges, which i think will reduce confusion and cognitive load on people, and will allow the items to be used as-is with existing media type and content negotiation tools, while still being a Structured Field™.

mnot commented 1 week ago

The problem is that if the difference between tokens and strings is semantically significant, an implementation needs to be able to represent that in its API, which can be unpleasant/unidiomatic (e.g., putting one of them in a wrapper class).
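As an illustration of the wrapper-class concern, here is a minimal Python sketch (not from the thread and not any real library's API). `Token` and `serialize_bare_item` are hypothetical names. Python has one native string type, so a parser that must preserve the Token/String distinction needs something like a marker subclass.

```python
class Token(str):
    """Hypothetical marker type: an SF Token rather than an SF String."""


def serialize_bare_item(value: str) -> str:
    """Serialize a Token bare, and a plain str as a quoted SF String."""
    if isinstance(value, Token):
        return str(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


as_token = serialize_bare_item(Token("utf-8"))
as_string = serialize_bare_item("utf-8")

print(as_token)   # utf-8
print(as_string)  # "utf-8"

# The two compare equal as Python strings, so callers must remember to
# check the type whenever the difference is meant to be significant.
same_text = Token("utf-8") == "utf-8"
print(same_text)  # True
```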

mnot commented 1 week ago

Discussed with @reschke offline. Make it a Token or String, with explicit semantic equivalence between them. Allow Parameters that have Strings (only) as values.

zenomt commented 1 week ago

@mnot wrote: "Make it a Token or String, with explicit semantic equivalence between them. Allow Parameters that have Strings (only) as values."

from context i assume "it" is the type "/" subtype part of a media-type or media-range.

it seems to me that the point of the Accept-Query field is to convey one or more parameterized media types/ranges that the URI understands for a QUERY, just like Accept, Accept-Post, and Accept-Patch do for their purposes. since a media type/range has an established meaning and syntax in a larger ecosystem beyond this one field, i think that if this field has to be a Structured Field, then in the spirit of @mnot's Retrofit Structured Fields for HTTP we should impose only the minimum necessary constraints on serializing a media type/range to conform to the syntax and semantics of a SF, while not expanding the field syntax to be incompatible with media-type or media-range.

i think the minimum necessary constraints on a media type or range would be "the top-level type can't begin with a digit" (which no IANA-coordinated top-level media types currently do), "no spaces before a parameter's semicolon", "only spaces, no HTABs, between a semicolon and a parameter", and "otherwise a valid media-type or media-range according to the ABNF of RFC 9110". a parameter's value should be able to be a token or a quoted string, because it's not necessary to constrain it to only a string and still be a SF (and it can't only be a token because media-type parameters can have spaces and other non-token characters).

likewise, the type "/" subtype part (that is, the top-level type and the subtype, or top-level/*, or */*) should not be able to be a quoted string, because that would be incompatible with the existing syntax of a media-type or media-range.
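The constraints listed in this comment can be sketched as a single regular expression. This is an illustrative Python sketch, not from the thread. The name `is_constrained_media_range` is hypothetical, and it checks only the constraints named above, not anything else RFC 9651 might require.

```python
import re

TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
TOKEN = rf"{TCHAR}+"
# Top-level type: a token that does not begin with a digit.
TYPE = rf"[A-Za-z]{TCHAR}*"
# Simplified quoted-string: any characters, with backslash escapes.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# No whitespace before ";", only SP (never HTAB) after it;
# the value may be a token or a quoted string.
PARAM = rf"; *{TOKEN}=(?:{TOKEN}|{QUOTED})"
MEDIA_RANGE = re.compile(rf"(?:\*/\*|{TYPE}/\*|{TYPE}/{TOKEN})(?:{PARAM})*")


def is_constrained_media_range(value: str) -> bool:
    """True if value is a media-range that also meets the listed constraints."""
    return MEDIA_RANGE.fullmatch(value) is not None


checks = {
    'application/something; param1="foo"': True,
    "text/*;q=0.9": True,
    "*/*": True,
    "3d/model": False,                      # type begins with a digit
    "application/something ;a=b": False,    # space before the semicolon
    "application/something;\ta=b": False,   # HTAB after the semicolon
    '"application/something"': False,       # type/subtype as a quoted string
}
results = {value: is_constrained_media_range(value) for value in checks}
print(results == checks)  # True
```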

zenomt commented 1 week ago

@mnot if i misunderstood what you meant, and what you actually meant was that "a parameter's value is a string, which can be serialized as a token or a quoted-string with no semantic difference", then i'm on board with that.
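Under that reading, a receiver would normalize a parameter value before comparing it. A minimal Python sketch (not from the thread), with `param_value` as a hypothetical helper:

```python
def param_value(raw: str) -> str:
    """Return a parameter's value whether written as a token or a quoted string."""
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        # Drop the surrounding quotes and undo backslash escapes.
        out, chars = [], iter(raw[1:-1])
        for ch in chars:
            out.append(next(chars, "") if ch == "\\" else ch)
        return "".join(out)
    return raw


from_token = param_value("utf-8")
from_quoted = param_value('"utf-8"')
print(from_token == from_quoted)  # True
```

With this normalization, `charset=utf-8` and `charset="utf-8"` carry the same value, which is the "no semantic difference" behaviour described above.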