tobie / specref

An open-source, community-maintained database of Web standards & related references.
http://www.specref.org/
Apache License 2.0

Add browser-specs as new source of specs #750

Closed tidoust closed 1 year ago

tidoust commented 1 year ago

The w3c/browser-specs project contains a curated list of specifications that compose the web platform. Most of the info it contains comes from the W3C API (that info is already in Specref), and from Specref itself.

That said, the list also contains additional specs that are not in Specref. These are mostly "early" drafts developed in W3C community groups but not yet published as Working Drafts, and WebGL extension specs.

With this update, Specref retrieves the specs in browser-specs that it does not yet know about and stores them in a browser-specs.json file. This is useful for spec editors: specs in browser-specs get crawled and ingested into the cross-reference database of terms used by Bikeshed and ReSpec, so an editor can currently end up able to reference a term defined in a spec and yet unable to reference the spec itself. This currently affects 170 specs, all of which this update adds to Specref.

The script runs first in run-all so that specs get dropped from browser-specs.json as soon as possible. Removal from the browser-specs source should typically mean that the spec has started to appear in another source, and it is better to rely on that other source directly.

The logic is based on fetch-refs.js with custom code to filter out specs that are already in Specref, and to delete former entries that no longer exist in browser-specs.
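As a rough illustration of that filter-and-prune step, here is a minimal sketch. It is not the code in this PR: `syncBrowserSpecs` is a hypothetical helper, and the field names (`shortname`, `title`, `url` on the browser-specs side, `title` and `href` on the Specref side) are assumptions made for the example.

```javascript
// Hypothetical sketch, not the actual script from this PR.
// browserSpecs: array of browser-specs entries (assumed fields: shortname, title, url)
// knownIds:     Set of uppercase IDs already provided by other Specref sources
// current:      existing contents of browser-specs.json, keyed by uppercase ID
function syncBrowserSpecs(browserSpecs, knownIds, current) {
  // Rebuilding from scratch means entries that disappeared from
  // browser-specs are dropped without any explicit delete step.
  const next = {};
  for (const spec of browserSpecs) {
    const id = spec.shortname.toUpperCase();
    // Skip specs that another, more authoritative source already covers.
    if (knownIds.has(id)) continue;
    // Keep any extra data from the previous entry, refresh title and URL.
    next[id] = { ...current[id], title: spec.title, href: spec.url };
  }
  return next;
}

// Usage with made-up data: "b" is already known elsewhere, and "C" is no
// longer listed in browser-specs, so only "A" survives.
const result = syncBrowserSpecs(
  [
    { shortname: "a", title: "Spec A", url: "https://example.org/a/" },
    { shortname: "b", title: "Spec B", url: "https://example.org/b/" }
  ],
  new Set(["B"]),
  { C: { title: "Spec C", href: "https://example.org/c/" } }
);
console.log(Object.keys(result));
```

Building the new file from the current browser-specs list, rather than patching the old file, keeps the "delete former entries" case implicit.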

One question: spec IDs in browser-specs are lowercase, and the code converts them to uppercase IDs for Specref. Or is it better to keep lowercase IDs along with the createUppercaseRefs aliasing logic?
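To make the two options concrete, here is a small sketch of what each could produce. Both functions are hypothetical and written only for this comparison; in particular, `withUppercaseAliases` is a guess at the general shape of alias-based handling (using `aliasOf` entries) and is not the actual createUppercaseRefs implementation.

```javascript
// Hypothetical sketch of the two ID strategies discussed above.

// Option 1: store each entry directly under its uppercase ID.
function toUppercaseIds(refs) {
  const out = {};
  for (const [id, entry] of Object.entries(refs)) {
    out[id.toUpperCase()] = entry;
  }
  return out;
}

// Option 2: keep the original lowercase IDs and add uppercase aliases
// that point back to them (assumed alias shape: { aliasOf: id }).
function withUppercaseAliases(refs) {
  const out = { ...refs };
  for (const id of Object.keys(refs)) {
    const upper = id.toUpperCase();
    // Do not overwrite an entry that already exists under the uppercase ID.
    if (upper !== id && !(upper in out)) {
      out[upper] = { aliasOf: id };
    }
  }
  return out;
}

const refs = { "example-spec": { title: "Example Spec" } };
console.log(toUppercaseIds(refs));
console.log(withUppercaseAliases(refs));
```

Option 1 yields one key per spec; option 2 yields two keys per spec, with the uppercase one resolving through the alias.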

I'm happy to maintain that code over time, as needed, e.g., if we realize that spec transitions to more authoritative sources (IETF, W3C, etc.) are not as smooth as they should be.

Via https://github.com/w3c/browser-specs/issues/975

tidoust commented 1 year ago

We might want to generalize this to handle WICG specs the same way (in a follow-up PR).

And get rid of the somewhat unmaintained WICG JSON file as a source? Yes, that should work. All WICG specs in Specref should be in browser-specs already, and code in browser-specs can already fetch the info it needs from the spec itself when needed.

There's just a minor hiccup in browser-specs right now with the extraction of the spec status from ReSpec drafts that would make browser-specs miss a number of cg-draft statuses: https://github.com/w3c/browser-specs/issues/944. I'll try to fix that.