quicwg / qlog

The IETF I-D documents for the qlog format
Other
77 stars 12 forks source link

Define extension schema policy and registry #415

Open LPardue opened 3 months ago

LPardue commented 3 months ago

Alternative to #284.

This follows Lennox's suggestion to take influence from RFC 8285 and reference extension schema according to a URI. IETF documents can use URNs under a formal namespace, and as such we register a new qlog URN Sub-namespace for Registered Protocol Parameter Identifiers.

Fixes #283

LPardue commented 3 months ago

@JonathanLennox I tried to borrow a lot from RFC 8285 but not take it fully verbatim. Would welcome your thoughts.

JonathanLennox commented 3 months ago

This matches my suggestion, looks good to me!

LPardue commented 3 months ago

I pushed some changes to update the text in h3, if we like that approach I will carry it over to the quic doc.

LPardue commented 3 months ago

On thinking some more, I think allowing extension of categories with new events is a complication we can avoid. Specifically, allowing different documents to extend a category risks non-unique event names if the document author is not careful with type names. For example. if ext-schema-a and ext-schema-b both extend h3 with a my-ev-type event (concating to the name h3:my-ev-type, then they'll be fine until someone tries to use both. It's going to be very hard to police event type names, especially for private extensions.

Instead, we can enforce that new schema need to define new globally-unique categories. For example, if we wanted to add new quic events in the future as part of a rollup extension schema, we could define quic-2030, recovery-2030 etc. Adding this to the guidance pushes private extensions to avoid the problem, and requires standard extensions to do it.

I pushed an update that attempts to explain all this.

LPardue commented 3 months ago

As you'll see, a few remarks, most tiny, but a few that imo should be tackled.

I'm personally not a big fan of disallowing extensions to extend existing protocol/category namespaces... I get the rationale, but I think this will cause unnecessary churn and overhead for implementers (listing all extension docs in additional_event_schema, requesting addition to IANA, tools having to decide how to validate etc.). That said... I don't really have a better solution atm to the issues this solves :(

I don't think it's that much extra work really. An IANA registration request is as simple as an email that will be forwarded to the designated experts (probably you, me and anyone else that wants to volunteer). Lazy implementations can omit the additional schema field at the cost of risk that tools will break.

If we allowed extension to categories while only recommending that additional_event_schema is provided, we risk tools having to deal with ambiguous types, which sucks.

If we do keep this, imo we do need to add some guidance on naming of new extensions for existing qlog definitions.

For example, https://www.ietf.org/archive/id/draft-ietf-tsvwg-careful-resume-07.html#name-cr_phase-event, is really a good fit for "quic:recovery", but wouldn't be able to just use that... so what should it do?

It can't just do quic-ext:recovery because that would pollute quic-ext for all other extensions... it can't do quic-rfcXYZA:recovery because juck. It might do quic-careful-resume:recovery but this should probably be recommended by the main spec to prevent a weird growth of these things going forward (UNLESS you expect all this to be caught in the IANA registration process, where expert reviews can catch quic-ext and require it be renamed to something like quic-careful-resume? even then though... a bit of prose with an example could be nice here).

The careful resume event is a good example but I disagree about your analysis. The definition of the event leaves it completely ambiguous about what category it belongs to, so it is unclear what it's serialized name would be. You seem to assume the event is for QUIC, but careful resume is a transport-independent. So, assuming we get this PR hammered out, the careful resume document would have clearer guidance to follow about defining an extension schema. My recommendation would for it to define a new URI like urn:ietf:params:qlog:careful-resume and category identifier cr, then the event type could be phase_updated and it's serialized name would be cr:phase_updated.

rmarx commented 3 months ago

If we allowed extension to categories while only recommending that additional_event_schema is provided, we risk tools having to deal with ambiguous types, which sucks.

I agree, and that's why I said I also don't see a better solution (sadly) :)

The careful resume event is a good example but I disagree about your analysis

So ok, maybe careful-resume can get away with its own identifier outside of quic :) but then let's assume a loss-bits extension that is specific to QUIC's header and defines the loss_bits_updated event. Then it's still a bit unclear if they should have urn:ietf:params:qlog:loss-bits or urn:ietf:params:qlog:quic-loss-bits or urn:ietf:params:qlog:quic-ext-loss-bits or... (or if this even matters as much ;)) and thus if we should have a bit of guidance for that (i.e., extensions of existing protocols/categories)

LPardue commented 3 months ago

I don't think the schema identifier is that important, it's just a unique thing.

Guidance for category identifiers would be useful but I'm not sure we have enough experience to be that helpful. It should reflect the expert guidance, the the category name should be specific and unique, and also keep in mind that it can't be extended in future. So if there's a loss bit extension schema that wanted to have different categories if could have urn:ietf:params:qlog:loss-bits#quic-loss-bits and urn:ietf:params:qlog:loss-bits#sctp-loss-bits.

That does raise the question whether we should allow adding new categories to existing schema. That seems something that is OK.

rmarx commented 6 days ago

As mentioned above, I remain on the edge on what exactly we would allow to be extended by new documents and why...

Recent comments from Lucas say things to the effect of

That does raise the question whether we should allow adding new categories to existing schema. That seems something that is OK.

but earlier he said it's not ok to add new events to existing categories, because you get ambiguous event names (category + type)

Specifically, allowing different documents to extend a category risks non-unique event names if the document author is not careful with type names

So there's an important nuance here: extending an existing category in an existing schema with new event types is NOT the same as extending an existing schema with a new category and new events types

For example:

    urn:ietf:params:qlog:http is the DOC/schema
    urn:ietf:params:qlog:http#h3 is the category
    urn:ietf:params:qlog:http#h3:frame-created is the type

If you allow new docs to re-use the h3 category:

    Document L defines:
    urn:ietf:params:qlog:http#h3:new-event-Z

    Document K also defines:
    urn:ietf:params:qlog:http#h3:new-event-Z

Then you indeed get collisions / ambiguities (because both would use the URI urn:ietf:params:qlog:http#h3 in additional_event_schemas). So far, so good, and I agree that's a problem.

HOWEVER, I don't really see how only re-using the schema itself would (completely) prevent that problem, except for IANA enforcing uniqueness.

If you allow new docs to only re-use the http schema and not the h3 category:

    Document A defines:
    urn:ietf:params:qlog:http#h4:new-event-X
    urn:ietf:params:qlog:http#h4:new-event-Z

    Document B also defines:
    urn:ietf:params:qlog:http#h4:new-event-Y 
    urn:ietf:params:qlog:http#h4:new-event-Z         

As long as both docs only define unique event types (new-event-X and -Y here), then indeed, no problem. But there's nothing stopping two independent docs from ending up with h4:new-event-Z independently.

At that point, you fall back to "whoever is first to register the h4 category identifier with IANA". If that's the intent, then I can understand that and (probably) live with it. But that isn't entirely clear from the current text imo (at least the part that it's allowed to extend an existing schema. So I'd add an example to say that another new document could register urn:ietf:params:qlog:john#cleese event though urn:ietf:params:qlog:john already existed. And ideally also the opposite, explicitly saying that a new document is NOT allowed to just add a new event type new-event-S to urn:ietf:params:qlog:john#goodman, even if there isn't such an event in currently "known" documents that use :john

I mainly want to confirm the intent here and make sure we make this as clear as possible, since even I am having a bit of a hard time fully nailing it down :)

LPardue commented 6 days ago

Putting aside the actual text I wrote for a moment, the logical structure I have in mind is

schema namespace
  category identifier
    event type

Then, the additional_event_schemas lists URIs that must include the schema namespace and category identifier but not the event type.

IETF documents intended to be RFC, should use a URI of format urn:ietf:params:qlog:<schema namespace>#<category identifier.

Any other legal URI is also allowed for private extensions. There is still a requirement that the URI includes both the <schema namespace> #<category identifier>.

For IETF documents, the need to register schema namespace and category identifier (the URI is a byproduct of this, not the definition IMO, and that is what is causing some friction IMO). If a namespace already exists, it should IMO be allowed to add categories to it. IANA can deal with that.

If somehow the IETF has adopted work items are are, in parallel, defining conflicting things (schema namespaces, category identiers, or event types) that is something I believe can be resolved through the regular standards process. Its likely a simple change that benefits both parties. I expect this to happen well in advance of the formal IANA action (i.e. during WG review, or LC).

One way to present this a bit more clearly is an IANA table that focuses more on the components and less on the URI e.g.

Schema Namespace Category ID Description Link
quic recover For events related to QUIC recovery RFC link
rick pickle For events related to pickled Richards RFC link

In other words, it won't matter if a schema namespace exists or not for practical purposes. Experts will still need to check if the addition of a category to an existing namespace makes sense for logical purposes though.

rmarx commented 6 days ago

Reading your last reply, I think I correctly understand your intent, and I think it's the right way forward indeed :) So now we just need to clarify the text to make the intent clearer to new readers. As I said up-top somewhere as well, explaining things like you do above with urn:ietf:params:qlog:<schema namespace>#<category identifier> would already help a lot.


Another thing I think we should do, is have the main schema also define its own schema namespace ("main") with two categories ("generic" and "simulation"). Currently, these are kind of just "there" and implied... We can then use those as examples without having to go out-document, and it's more consistent/cleaner.

It also makes it possible to indicate in additional_event_schemas whether or not you're actually using "generic" and "simulation" events (probably not). That DOES mean we should re-bikeshed the additional part of the name (and the idea of only having "extension" schemas) maybe :)

LPardue commented 5 days ago

Yeah this all makes sense to me, #rollsupsleaves