w3c / webref

Machine-readable references of terms defined in web browser specifications
https://w3c.github.io/webref/
MIT License

Webref

Description

This repository contains machine-readable references of CSS properties, definitions, IDL, and other useful terms that can be automatically extracted from web browser specifications (see a list of projects known to use the data). The contents of the repository are updated automatically every 6 hours, although information about published /TR/ versions of specifications is updated only once per day.

Specifications covered by this repository are technical Web specifications that appear in browser-specs.

The main branch of this repository contains automatically-generated raw extracts from web browser specifications. These extracts come with no guarantee on validity or consistency. For instance, if a specification defines invalid IDL snippets or uses an unknown IDL type, the corresponding IDL extract in this repository will be invalid as well.

The curated branch contains curated extracts. Curated extracts are generated from raw extracts in the ed folder by applying manually-maintained patches to fix invalid content and provide validity and consistency guarantees. The curated branch is updated automatically whenever the main branch is updated, unless patches need to be modified (which requires manual intervention). Curated extracts are published under https://w3c.github.io/webref/ed/.

Additionally, subsets of the curated content get manually reviewed and published as NPM packages on a weekly basis:

Important: The curated extracts only contain data for specifications that are in good standing (to keep the number of manually-maintained patches minimal and manageable). The NPM packages only contain curated extracts of specifications that are in good standing and that target web browsers.

Important: Unless you are ready to deal with invalid content, we strongly recommend that you process contents of the curated branch or NPM packages instead of raw content in the main branch.

Available extracts

This repository contains raw and curated information about nightly versions of Web specifications in the ed folder, as well as raw information about the released version (for /TR/ specifications) in the tr folder.

Note: The tr folder only contains information about released specifications. In particular, specifications that have not been published as /TR/ documents (such as WHATWG specifications or Community Group reports) do not appear under the tr folder.

More often than not, released versions of specifications are much older than their nightly versions. As a result, data in the tr folder is more likely to be invalid or inconsistent than data in the ed folder. Additionally, no attempt is made to curate data in the tr folder. Use the tr folder at your own risk!

The following subfolders in the curated branch contain individual machine-readable JSON or text files generated from specifications:

Individual files are named after the shortname of the specification, or after the shortname of the specification series for CSS definitions and raw IDL files. Individual files are only created when needed, meaning when the specification actually includes relevant terms.
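The naming rule above can be sketched as a small helper. This is an illustration only: the record keys `shortname` and `series_shortname`, and the kind labels `"css"`, `"idl"` and `"dfns"`, are hypothetical names chosen for this example and are not taken from the actual data format.

```python
# Extract kinds whose files are named after the specification series
# (per the text: CSS definitions and raw IDL files).
# The labels "css" and "idl" are assumptions made for this sketch.
SERIES_NAMED_KINDS = {"css", "idl"}


def extract_basename(spec, kind):
    """Return the file base name (no folder, no extension) for an extract.

    `spec` is a hypothetical record with "shortname" and "series_shortname".
    """
    if kind in SERIES_NAMED_KINDS:
        return spec["series_shortname"]
    return spec["shortname"]


# Made-up record: a levelled specification and the series it belongs to.
spec = {"shortname": "example-2", "series_shortname": "example"}
print(extract_basename(spec, "css"))   # named after the series
print(extract_basename(spec, "dfns"))  # named after the specification
```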

The ed/index.json file contains the index of specifications that have been crawled, and relative links to individual files that have been created.
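As a rough sketch of how a consumer might follow those relative links, the helper below resolves them against the URL where curated extracts are published. The index layout it assumes (a top-level `results` list whose entries carry a `shortname` and one relative link per extract kind) is a guess made for illustration, and the sample data is made up; check the real ed/index.json before relying on any of these names.

```python
from urllib.parse import urljoin

# Base URL of the published curated extracts (as given in this README).
CURATED_BASE = "https://w3c.github.io/webref/ed/"


def resolve_extract_links(index, kinds, base=CURATED_BASE):
    """Map each spec shortname to absolute URLs of its extract files.

    Assumed (hypothetical) layout: index["results"] is a list of dicts,
    each with a "shortname" key and, per extract kind, a relative link.
    """
    links = {}
    for spec in index.get("results", []):
        found = {}
        for kind in kinds:
            rel = spec.get(kind)
            # Files are only created when needed, so a kind may be absent.
            if isinstance(rel, str):
                found[kind] = urljoin(base, rel)
        if found:
            links[spec["shortname"]] = found
    return links


# Made-up sample shaped like the assumed layout, not real crawl data.
sample = {
    "results": [
        {"shortname": "example-spec", "idl": "idl/example-spec.idl"},
        {"shortname": "prose-only-spec"},
    ]
}
print(resolve_extract_links(sample, ["idl", "dfns"]))
```

Specifications without any of the requested extracts are simply left out of the result, which mirrors the fact that individual files only exist when a specification includes relevant terms.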

This repository uses Reffy, a Web spec exploration tool, to crawl the specifications and generate the data. The data it contains is the result of running Reffy; the repository contains no other data.

Raw WebIDL extracts are used in web-platform-tests. See their interfaces/README.md for details.

Curation guarantees

Data curation brings the following guarantees.

Web IDL extracts

CSS extracts

Elements extracts

Events extracts

Known consumers

The following projects are known to use webref data:

Using webref data in a project that is not yet in the list? Let us know!

Potential spec anomalies

Webref extracts are analyzed with Strudy to detect potential spec content anomalies such as broken links or invalid constructs, and report them as issues in the repository that hosts the spec.

Global analyses used to be published in the w3c/webref-analysis repository. That repository was archived in August 2024.

How to suggest changes or report an error

Feel free to raise issues in this repository as needed. Note that most issues are likely to apply more directly to the underlying tools:

Development notes

GitHub Actions workflows are used to automate most of the tasks in this repo.

Data update

Releases to NPM