manton / JSONFeed

The JSONFeed.org website
Creative Commons Zero v1.0 Universal

consider not using _ to prefix extensions #19

Open kr opened 7 years ago

kr commented 7 years ago

From RFC 6648 ‘Deprecating the "X-" Prefix and Similar Constructs in Application Protocols’:

   Historically, designers and implementers of application protocols
   have often distinguished between standardized and unstandardized
   parameters by prefixing the names of unstandardized parameters with
   the string "X-" or similar constructs.  In practice, that convention
   causes more problems than it solves.  Therefore, this document
   deprecates the convention for newly defined parameters with textual
   (as opposed to numerical) names in application protocols.

This spec's _ prefix seems to qualify as a "similar construct".

Maybe it's too late to do anything about this, and even if not, maybe the arguments in that RFC are not persuasive enough. But I wanted to raise the issue anyway, just in case.

kr commented 7 years ago

For convenience, here's the main point:

Appendix B.  Analysis

   The primary problem with the "X-" convention is that unstandardized
   parameters have a tendency to leak into the protected space of
   standardized parameters, thus introducing the need for migration from
   the "X-" name to a standardized name.  Migration, in turn, introduces
   interoperability issues (and sometimes security issues) because older
   implementations will support only the "X-" name and newer
   implementations might support only the standardized name.  To
   preserve interoperability, newer implementations simply support the
   "X-" name forever, which means that the unstandardized name has
   become a de facto standard (thus obviating the need for segregation
   of the name space into standardized and unstandardized areas in the
   first place).

kr commented 7 years ago

For example, a hypothetical alternate version of the ‘Extensions’ section might have said:

Publishers can use custom objects in JSON Feeds. Any names that aren't described on this page are custom. Custom objects can appear anywhere in a feed.

It’s good practice to name an extension with a company or service name, to provide a clue right away as to what it’s for and who made it. However, if your custom object is useful to most people who read and write feeds, consider the possibility that it might end up becoming a de facto standard whether you want it to or not. If that seems likely, choose a suitably general name.

svenluijten commented 7 years ago

This could also be solved by introducing a new top-level object: extra. It might look like this:

{
  "extra": {
    "my_vendor": {
      "some_key": "value"
    }
  }
}

This extra object would obviously be optional.
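
As a rough sketch of how a reader might treat the two conventions, assuming the `extra` object proposed here (it is not part of the spec) and a hypothetical helper named `collect_extensions`, both layouts can be normalized to the same result:

```python
def collect_extensions(feed: dict) -> dict:
    """Return vendor extensions found under either convention.

    Hypothetical helper for illustration; not defined by the JSON Feed spec.
    """
    found = {}
    # Convention in the current spec: keys that start with "_".
    for key, value in feed.items():
        if key.startswith("_") and isinstance(value, dict):
            found[key[1:]] = value
    # Convention proposed in this comment: one optional "extra" object.
    extra = feed.get("extra", {})
    if isinstance(extra, dict):
        for vendor, value in extra.items():
            # On a name clash the prefixed entry wins; an arbitrary choice.
            found.setdefault(vendor, value)
    return found


prefixed = {"title": "Example", "_my_vendor": {"some_key": "value"}}
nested = {"title": "Example", "extra": {"my_vendor": {"some_key": "value"}}}
assert collect_extensions(prefixed) == collect_extensions(nested)
```

Either way the reader has to know where to look; the difference is one level of nesting versus one character in the key.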

manton commented 7 years ago

Thanks @kr and @svenluijten. Good to know about RFC 6648. I lean toward keeping _ because it reads cleanly compared to extra nesting, and it is obvious what is not part of the specification, but I don't think I mind extensions that use generic names. For example, in Micro.blog I decided to use _microblog because the extra fields could just as easily be part of a different service. (We debated a few other more unique choices like reverse-DNS strings, etc. but they added a lot of clutter to the format.)

kornelski commented 7 years ago

A popular extension will become a de-facto standard. This has happened many times on the web (apple-touch-icon, meta viewport, tons of -webkit-* CSS supported cross-browser, XFF HTTP header, etc.).

The HTML standard ended up using a central registry for this (https://wiki.whatwg.org/wiki/MetaExtensions) with a very low bar for registration of new names, and it seems to work fine.

So I'd suggest just asking people to coordinate. Decentralized zero-contact extensibility sounds nice, but in reality we can talk to each other, especially since extensions will require cooperation between feed creators and consumers anyway.

ttepasse commented 7 years ago

Given the history of RSS extensions, coordination seems too hard. Following RFC 6648 gives you something like this:

"https://blueshed-podcasts.com/json-feed-extension": {
    "explicit": false,
    "copyright": "1948 by George Orwell",
    "owner": "Big Brother and the Holding Company",
    "subtitle": "All shouting, all the time. Double. Plus. Good."
}

Possible spec text:


Of course the developer behind the extension spec can and will go out of business and delete the docs, as happened with so many RSS extension documents. But if one or more extensions in the same subject space get popular and are useful for the whole ecosystem, like the podcasting extensions to RSS, I'd argue that it is the job of the spec to document those extensions instead of leaving the subject open to different extensions by different sources, all of which can go under.

kornelski commented 7 years ago

AFAIK RSS did not even attempt to have any sort of central coordination. There was no registry. The 2.0 spec didn't even have a clear way to provide feedback.

They've just said to use XML namespaces, and XML namespaces were generally misunderstood and incorrectly implemented (e.g. Sparkle required a specific prefix name instead of using NS URLs).

URL keys are longer, much harder to remember and easier to get wrong (was it http or https? trailing slash?).

And they're solving the problem of decentralized extensibility with zero contact, instead of the problem of "we're all going to have to use this weird key if this extension becomes de-facto standard".

kornelski commented 7 years ago

Let's say message stickers become the new rage and somehow every podcast will have to have a sticker pack. We could have:

manton commented 7 years ago

I think @pornel's examples highlight why we went with _sticker. All extension styles have downsides, so best to go with the one that is the most simple and readable. I've seen a few extensions in the wild already, and they don't overwhelm the document. (If an extension becomes widely used, makes sense to file a request to incorporate it into the spec.)

kr commented 7 years ago

Going with just sticker would have been (as far as I can tell) the most simple and readable, even more so than _sticker. No centralized registry is necessary. Same spec, same guidelines, same everything else, just… no underscore.

If an extension becomes widely used, makes sense to file a request to incorporate it into the spec.

Yes! The interesting part is what happens then?

There's a popular extension called _sticker. Everyone loves it and agrees it should be put in the spec, but there's lots of software out in the wild that supports only the underscore-name. You can't change the name because that would fail to interoperate with existing software (remember Rule #1). You can't standardize the existing name because the spec explicitly promises never to specify underscore-names, even in the future. What do you do?

manton commented 7 years ago

It's similar to what happens with CSS extensions, e.g. -webkit-opacity. If it's incorporated into the official spec, people use both for a little while, then eventually we forget about the extension and just use the standard name. It's not perfect, but I think it's better than the conflicts that could happen if there's no prefix on extensions.
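
To make the "use both for a little while" transition concrete, here is a minimal sketch of what a feed reader might do, assuming the thread's hypothetical sticker extension was later standardized under the bare name. `read_member` is an illustrative helper, not anything from the spec:

```python
def read_member(item: dict, standard_name: str):
    """Prefer the standardized name, fall back to the underscore name.

    Illustrative only: "sticker" is this thread's hypothetical extension.
    """
    if standard_name in item:
        return item[standard_name]
    # Older publishers may still emit only the prefixed form.
    return item.get("_" + standard_name)


old_item = {"id": "1", "_sticker": {"url": "https://example.org/a.png"}}
new_item = {"id": "2", "sticker": {"url": "https://example.org/b.png"}}

assert read_member(old_item, "sticker") == {"url": "https://example.org/a.png"}
assert read_member(new_item, "sticker") == {"url": "https://example.org/b.png"}
```

The fallback branch can only be removed once no publisher still emits the prefixed form.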

kr commented 7 years ago

Yeah, so with CSS, basically only web browsers can add extensions. (Because they're the only software that interprets CSS that people have to interact with over the network.) And they basically only do this when they're trying to get their extension into the standard. And even then it only works because almost all browsers have aggressive auto-update systems. Everyone who's writing CSS files knows this from the start, so they plan for it and it's less of a headache. (But still a big headache that takes years to resolve for each extension.)

The current situation strikes me as more similar to HTTP header fields. The classic example is X-Forwarded-For. There's a Forwarded field in the HTTP RFC, but nobody uses it. Everybody still uses X-Forwarded-For because that's the one that works. With X-Frame-Options (https://tools.ietf.org/html/rfc7034), they realized it would be futile to try to rename it and didn't even bother, they just standardized the X name.

The difference is, with CSS there are basically only five programs in the world you need to worry about, and they auto-update like nobody's business. With HTTP there are hundreds, and many of them are pretty conservative (i.e. slow or never) with updates.

I think JSON feed (especially if it is successful — and I really hope it will be! I love this spec, should've said so to start with) will be more on the HTTP end of the spectrum. As a machine-readable and machine-writable format, it'll have various network services that generate, transform, and interpret it. A feed aggregator is one example. If there is a widely used extension, say, _forwarded_for, that shows the URLs of the upstream feeds, you'll never manage to rename it to forwarded_for even if you want to.

You can specify the new name, but everyone will just keep using the name that is generated by the feed server on their random free multi-tenant web host and still works even on their cousin's five-year-old PC.

In short: I predict the _ prefix will make things more complicated.

the conflicts that could happen if there's no prefix on extensions

I don't totally follow here. Do you have an example of a conflict that would happen with no prefix on extensions, and that would be prevented by putting an underscore in front?

Note that new nonstandard HTTP header fields (these days) don't use any prefix (e.g. DNT), and conflicts are not a problem in practice.

manton commented 7 years ago

Good points. You're right that CSS is not the best comparison, although it's still similar in that I think the most popular extensions will be proposed by feed readers.

As for conflicts, let's imagine that sticker in one extension contains a pair of string values (maybe full_url and thumbnail_url), and in another extension is an object with different members (maybe type, size, and url). It's true that the conflict is there whether the field has an underscore prefix or not. But the problem gets worse if there's no prefix, because if we bring one version of the extension into the spec, and new feed readers adopt it, those feed readers will be surprised when they encounter the same field name with a completely different structure underneath it.

In this scenario it actually becomes difficult to even check whether a feed is valid. As soon as we promoted an extension with a conflicting name like that into the spec, we'd actually invalidate other feeds that used to be perfectly fine (the sticker extension that used the thumbnail_url value, for example, when the spec now says it must be an object with 3 different required members). The prefix should prevent that from happening.
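
A small sketch of that validation problem, under the assumption that the spec promoted the type/size/url shape under the bare name `sticker`. The function names and the idea that a validator skips underscore keys are illustrative assumptions, not spec text:

```python
def sticker_shape(sticker: dict) -> str:
    """Classify which of the two hypothetical sticker layouts this is."""
    keys = set(sticker)
    if {"full_url", "thumbnail_url"} <= keys:
        return "url-pair"
    if {"type", "size", "url"} <= keys:
        return "typed"
    return "unknown"


def is_valid_after_promotion(item: dict) -> bool:
    """Assume the spec now requires the 'typed' shape for 'sticker'.

    Underscore-prefixed keys are never inspected here, on the assumption
    that a validator treats them as opaque extensions.
    """
    if "sticker" not in item:
        return True
    value = item["sticker"]
    return isinstance(value, dict) and sticker_shape(value) == "typed"


pair = {"full_url": "https://example.org/f.png",
        "thumbnail_url": "https://example.org/t.png"}

# Unprefixed: a feed that was fine before promotion is now invalid.
assert not is_valid_after_promotion({"sticker": pair})
# Prefixed: the same data is untouched by the promotion.
assert is_valid_after_promotion({"_sticker": pair})
```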

manton commented 7 years ago

As @sonicdoe pointed out in another issue, extensions have to be objects, so my sticker example above was originally too much of a simplification. I've edited it to fix this. I think the point is the same either way. Thanks!

kornelski commented 7 years ago

The situation with one key having two different bodies in two different implementations may happen indeed, but I think it's not too bad for two reasons: a) it can be made less likely, b) even if it happens, it can be resolved.

It can be made less likely by tracking what extensions are used in the wild, and asking people to propose/register extensions they use. It is in implementors' interest to have their extensions widely known, conflict-free, and compatible with other clients.

And even when a conflict happens, it can be fixed:

kr commented 7 years ago

Aha, I see. Yeah, of course I can't say that such a conflict wouldn't happen. (But as @pornel mentioned it's in everyone's interest to avoid conflicts, so they seem unlikely. This matches what's happened so far with HTTP.)

In that unfortunate scenario, I agree the spec wouldn't want to pick just one of the two sticker extensions to standardize. I think a reasonable thing to do would be to use a new name such as badge or stickie or whatever. Then this problem reduces to the same problem as with _ prefixes, except now it only happens when there is actually a conflict.

But I appreciate that all that might end up still being a confusing situation. Thanks for hearing me out.

dshanske commented 6 years ago

Reading the spec, it says the prefix applies to new objects, not necessarily to extra and established properties inside an existing object. For example, I just added a syndication property to items based on alternate copies of it. This isn't an object.