w3c / server-timing

Server Timing
http://w3c.github.io/server-timing/
Other
75 stars 20 forks source link

ABNF for metric-value #12

Closed reschke closed 7 years ago

reschke commented 7 years ago

In 2.1:

Server-Timing        = "Server-Timing" ":" #server-timing-metric
server-timing-metric = metric  [ ";" description ] 
metric               = metric-name [ "=" metric-value ]
metric-name          = token 
metric-value         = 1\*DIGIT [ "." 1\*DIGIT ]
description          = token | quoted-string

This makes it sound as if metric-value can't use quoted-string syntax, that is, you're special-casing the more generic HTTP name=value syntax. Just state that metric-value can be token or quoted-string, and then note that the value (after potential quoted-string processing), will need to conform to the given syntax.

(also nit: what are the backslashes doing there?)

igrigorik commented 7 years ago

Just state that metric-value can be token or quoted-string, and then note that the value (after potential quoted-string processing), will need to conform to the given syntax.

How do we do that in ABNF? =/

The \'s are an escaping bug.

reschke commented 7 years ago

The same way we do it in other specs. For instance: https://greenbytes.de/tech/webdav/rfc7234.html#header.cache-control

yoavweiss commented 7 years ago

Related: We should allow for optional white space in server-timing-metric and in metric. So they should be something like: server-timing-metric = metric [OWS ";" OWS description] metric = metric-name [ OWS "=" OWS metric-value ]

igrigorik commented 7 years ago

@yoavweiss the cache-control examples @reschke linked to don't specify OWS.. @reschke are those implicit / do we need them here?

Also, @reschke I'm still unclear on the proper form here.. any chance you can help define the BNF? It seems that what we have is much more rigorous -- which is good, in my books -- as compared to a general "quoted-string is accepted".

reschke commented 7 years ago

OWS is not implied here.

WRT other question: "more rigorous" is bad when it requires people not to use off-the-shelf parsers to implement it. Just always allow both token and quoted-string.

igrigorik commented 7 years ago

re, OWS: should we be allowing it here? What's the recommended best practice?

Server-Timing        = "Server-Timing" ":" #server-timing-metric
server-timing-metric = metric  [ ";" description ] 
metric               = token [ "=" ( token / quoted-string ) ]
description          = token / quoted-string

Metric argument syntax: 1*DIGIT [ "." 1*DIGIT ]


^ is that closer to what you're suggesting? I'm not sure how to clearly spell out the metric-value connection here.

reschke commented 7 years ago

In general, this seems to be another syntax which is similar to existing fields, but just slightly different, because "description" is special (by position), and there's no extension point for parameters. Which is not good.

I'd model this after recently defined fields such as Alt-Svc or Prefer.

igrigorik commented 7 years ago

Copying for reference...

https://tools.ietf.org/html/rfc7838#section-3

   Alt-Svc       = clear / 1#alt-value
   clear         = %s"clear"; "clear", case-sensitive
   alt-value     = alternative *( OWS ";" OWS parameter )
   alternative   = protocol-id "=" alt-authority
   protocol-id   = token ; percent-encoded ALPN protocol name
   alt-authority = quoted-string ; containing [ uri-host ] ":" port
   parameter     = token "=" ( token / quoted-string )

https://tools.ietf.org/html/rfc7240#section-2

     Prefer     = "Prefer" ":" 1#preference
     preference = token [ BWS "=" BWS word ]
                  *( OWS ";" [ OWS parameter ] )
     parameter  = token [ BWS "=" BWS word ]

Prefer doesn't seem to account for quoted-string. Also, BWS? Ditto for protocol-id and alt-authority on Alt-Svc. Honestly, the more examples I look at, the more confused I am as to what we ought to be writing here. Some hand-holding would be appreciated.. :-) /cc @mnot

reschke commented 7 years ago

See https://www.rfc-editor.org/errata/eid4439 - word means "token or quoted-string".

In name/value pairs, it's the value that is supposed to support token or quoted-string. alt-authority is a bit special because a legal value never can be a token, so that was left off the ABNF.

igrigorik commented 7 years ago

/me squints at the examples. Ok, another try...

Server-Timing        = "Server-Timing" ":" #server-timing-metric
server-timing-metric = metric  [ OWS ";" OWS word ] 
metric               = token [ OWS "="  OWS word ]

If metric value is defined as word how do we ensure that it conforms to 1*DIGIT [ "." 1*DIGIT ]?

reschke commented 7 years ago

OK, let's use Prefer as example (https://www.greenbytes.de/tech/webdav/rfc7240.html#prefer):

  Server-Timing = 1#server-timing
  word = token / quoted-string
  server-timing = token [ BWS "=" BWS word ] 
               *( OWS ";" [ OWS parameter ] )
  parameter  = token [ BWS "=" BWS word ]

That should be sufficient for the syntax. Then, if a specific parameter name or value has a specific sub-syntax, just define that separately, example in https://www.greenbytes.de/tech/webdav/rfc7234.html#cache-request-directive.max-age

igrigorik commented 7 years ago

We don't have parameter specific sub-syntax, all parameter values are of the form: 1*DIGIT [ "." 1*DIGIT ]. Is that something we'll need to define separately still?

p.s. thanks for your help here!

reschke commented 7 years ago

Well, with my proposed change you wouldn't have an implied description property anymore, instead you'd have a parameter called "description" (or something like that), and that value would not be constrained, right?

(FWIW, would you prefer to use JSON here? Asking seriously...)

igrigorik commented 7 years ago

Description is not constrained, but the parameter values is.. e.g. "foo=bar;random" is not valid, whereas "foo=123.12;random" is.

Re, JSON: is JFV still a thing? I thought that was put on pause?

reschke commented 7 years ago

Description is not constrained, but the parameter values is.. e.g. "foo=bar;random" is not valid, whereas "foo=123.12;random" is.

Yes. So state what the value syntax for that specific value is. See Cache-Control definition, for example.

Re, JSON: is JFV still a thing? I thought that was put on pause?

It was taken from the WG agenda, but I'm totally willing to finish the spec and get it published.

mnot commented 7 years ago

It was taken from the WG agenda, but I'm totally willing to finish the spec and get it published.

Be aware that if Julian does this, it won't be a standard; just an individual draft.

reschke commented 7 years ago

Be aware that if Julian does this, it won't be a standard; just an individual draft.

It won't be a RFC originating by a WG. It's not clear yet what kind of RFC (Informational, Experimental, Standards Track) it would be.

mnot commented 7 years ago

That's true, but if you request it to be an independent submission on the standards track, it will be examined by the IESG for potential conflicts with ongoing standards work. There is currently a work item in the HTTP working group that's very similar (and in fact replaced this).

yoavweiss commented 7 years ago

@reschke if I understand your proposal correctly, an example would look like: Server-Timing: metric="foo";value="100.4";description="bar" Is that correct? Don't we need to define valid token values as well?

Maybe something like the following?

  Server-Timing = 1#server-timing
  word = token / quoted-string
  parameter-name = "metric" / "value" / "description"
  parameter  = parameter-name [ BWS "=" BWS word ]
  server-timing = parameter *( OWS ";" [ OWS parameter ] )

/ @cvazac

reschke commented 7 years ago

Server-Timing: metric="foo";value="100.4";description="bar"

That would work, but what I had in mind was:

Server-Timing: foo="100.4";description="bar"

I would recommend against defining parameter names in the ABNF. This is similar to a layer violation.

The ABNF drives the parser, and it's not the parser's role to handle unexpected parameters - it's the job of the code running the parser.

Again (:-) - see https://www.greenbytes.de/tech/webdav/rfc7234.html#header.cache-control as example.

yoavweiss commented 7 years ago

@reschke That'd mean that developers cannot define a metric called "description".

This whole thing would also mean we cannot limit the metric's value to a number (and therefore cannot convert it to one when exposing it to JS). So we're (to an extent) trading off developer convenience for consistency :/

reschke commented 7 years ago

@reschke That'd mean that developers cannot define a metric called "description".

Yes, they could:

Server-Timing: description="100.4";description="bar"

(the point being that you can make the first parameter special)

This whole thing would also mean we cannot limit the metric's value to a number (and therefore cannot convert it to one when exposing it to JS). So we're (to an extent) trading off developer convenience for consistency :/

Of course you can. What makes you think you can't?

cvazac commented 7 years ago

I like the idea of explicitly naming the description param, as it makes adding params in the future obvious.

Regarding metric-value: I don't see what allowing a quoted string gets us here. Best I can tell, the only reason for quoted strings is to be able to include special/delimiting characters, but that shouldn't be necessary for numbers. Also, there's prior art for restricting to numerics for max-age and q in accept (and likely others).

Regarding whitespace (WS): If I'm reading this right, we don't need BWS because there is nothing we need to be backwards compatible with. There is no WS specified in the content-type header, so likely browsers are being too lenient. Doesn't look like they want WS before or after the comma in lists either.

Outside of adding OWS for readability, this would be my preference:

Server-Timing: db=123;description=database,auth=456;description="description with spaces",some-future-param=789

With the ABNF of:

Server-Timing        = "Server-Timing" ":" #server-timing-metric
server-timing-metric = metric *[ ";" parameter ]
metric               = metric-name [ "=" metric-value ]
metric-name          = token
metric-value         = 1*DIGIT [ "." 1*DIGIT ]
parameter            = attribute "=" value
attribute            = token
value                = token / quoted-string
reschke commented 7 years ago

I don't see what allowing a quoted string gets us here. Best I can tell, the only reason for quoted strings is to be able to include special/delimiting characters, but that shouldn't be necessary for numbers. Also, there's prior art for restricting to numerics for max-age and q in accept (and likely others).

The sole reason is that we want people to be able to use generic parsers. If they do, they'll accept both token and quoted-string. Thus, for interop, all should accept that.

max-age is supposed to support token and quoted-string, see https://www.greenbytes.de/tech/webdav/rfc7234.html#cache-response-directive.max-age

q in Accept indeed is special; but that doesn't mean that adding even more special cases is good.

If I'm reading this right, we don't need BWS because there is nothing we need to be backwards compatible with.

That's the problem of the HTTP world having too little guidance for this, so everybody makes their own decision.

There is no WS specified in the content-type header, so likely browsers are being too lenient. Doesn't look like they want WS before or after the comma in lists either.

For HTTP list production, it's not your choice. It's fully defined by the base syntax: https://www.greenbytes.de/tech/webdav/rfc7230.html#abnf.extension

The proposed ABNF looks good to me, except that metric-value is made special.

cvazac commented 7 years ago

max-age is supposed to support token and quoted-string, see https://www.greenbytes.de/tech/webdav/rfc7234.html#cache-response-directive.max-age

Maybe I'm missing something, but: "This directive uses the token form of the argument syntax: e.g., 'max-age=5' not 'max-age="5"'. A sender should not generate the quoted-string form." (https://www.greenbytes.de/tech/webdav/rfc7234.html#rfc.section.5.2.2.8.p.3)

For HTTP list production, it's not your choice. It's fully defined by the base syntax: https://www.greenbytes.de/tech/webdav/rfc7230.html#abnf.extension

Ok, that's great that we have OWS before/after the comma separating items in a list, thanks for pointing that out.

reschke commented 7 years ago

Maybe I'm missing something, but: "This directive uses the token form of the argument syntax: e.g., 'max-age=5' not 'max-age="5"'. A sender should not generate the quoted-string form." (https://www.greenbytes.de/tech/webdav/rfc7234.html#rfc.section.5.2.2.8.p.3)

Yes, for backwards compat. But recipients MUST process both.

igrigorik commented 7 years ago

Server-Timing: description="100.4";description="bar"

I do like the future-proofing to allow extensibility (explicit description), but I'll also flag that we already have existing implementations (e.g. Chrome DevTools) using earlier syntax. /cc @sroussey

igrigorik commented 7 years ago

(bump) ... any conclusions / next steps here?

@sroussey @paulirish how much developer (and implementer) pain would updating the parameter definition cause at this point? See https://github.com/w3c/server-timing/issues/12#issuecomment-309083836 for context. If we're looking at a non-trivial amount.. my personal preference would be to stick with what we have -- we should have caught this earlier.

cvazac commented 7 years ago

copying thoughts obtained via email here:

from @paulirish:

First, I'm okay with changing the format and eventually breaking folks. Ideally I'd like a release of chrome with support for both. We can log a warning to console to raise visibility.

from @igrigorik

Re, ABNF update.. Andrew's example gets a lot more verbose if we add explicit description tags.

backendIP;description="10.1.135.126", backendName;description="39MzaFBISKerFJYsF", ....

from @triblondon

Yeah, I don't like the explicit description tags at all. And it doesn't seem to really solve the problem of distinguishing millisecond duration values from other numeric entries. Why not just require a unit? If the value ends in 'ms', plot it, otherwise, tabulate it. Then just make it a simple list of KV pairs? But I'm late to this discussion and struggling to comprehend all the nuance that's gone before.

It's worth noting, maybe, that this is the first header I'm aware of that has a JavaScript API in the browser and no side effects of use. That means that potentially any server or intermediate proxy that wants to communicate data to a browser outside of the body of the document might stash all kinds of random stuff into Server-Timing. At Fastly, for example, it is common for customers to manipulate headers in edge logic, but we make it hard for them to do much, if anything, with the body (mostly because processing usually occurs before the body has been received). I've been experimenting with stashing JSON in a Server-Timing string value (yes I know, I'm appalling, sorry)

cvazac commented 7 years ago

to summarize where we are: @reschke is proposing explicitly named key/value pairs after the semi-colon, so user agents can re-use existing parsers, like: metricName = 123.4; description = "the description". This is more verbose, but also future-proofs the syntax for extra data.

As @triblondon points out, if a dev needs structured data on a server timing entry, they can use serialized JSON. This would make the header more like this: metricName = 123.4; "anything-goes-here".

cc @igrigorik

igrigorik commented 7 years ago

@reschke is proposing explicitly named key/value pairs after the semi-colon, so user agents can re-use existing parsers, like: metricName = 123.4; description = "the description". This is more verbose, but also future-proofs the syntax for extra data.

In theory, UA's have shared ABNF parsers that can be reused.. In practice, I don't think that's actually true, as each field is it's own snowflake. As such, I'm not sure that's a top criteria to optimize for.

Re, future proofing: it's a nice property, but it does come with its own costs and complexity. For one, it makes the header field much more verbose -- the example that @triblondon shared would almost double in size with all the extra "description="'s. Separately, making this explicit in the header hints to the user that they can provide other fields.. except, we won't surface them or make the available -- at least as specced. FWIW, I think we might be adding unnecessary complexity here..

As @triblondon points out, if a dev needs structured data on a server timing entry, they can use serialized JSON. This would make the header more like this: metricName = 123.4; "anything-goes-here".

Indeed. Not saying that I endorse this, but... :-)

reschke commented 7 years ago

FWIW, the size problem can be partly addressed by picking a shorter name.

igrigorik commented 7 years ago

@reschke true, but having thought (and talked) more about it, I'm leaning back towards keeping it (intentionally) simple. It was never the goal of this API to provide a generic mechanism to shuffle arbitrary data via HTTP headers; the goal is to enable a simple and efficient mechanism to communicate server processing durations and short annotations, which the current format already covers well, I think. If someone wants to communicate larger payloads, they can either JSON-encode or pass a URL in the description -- similar to what SpeedTracer did way back.

So, long story short, I propose we stick with the "name=duration;description".

igrigorik commented 7 years ago

Closed via https://github.com/w3c/server-timing/pull/37#pullrequestreview-73951059.