Closed tidoust closed 3 years ago
I've just pushed some of the work I had started a while ago on this in a branch: https://github.com/tidoust/reffy/blob/backrefs/src/cli/study-backrefs.js
Posting it here because I'm a bit stuck on how to integrate this cleanly into the existing study report, but it is already usable and useful, so I wouldn't want it to get lost.
I've pushed an update to the branch, which outputs the following reports:
It also cannot find a match for the following "shortnames" in links: [ "2dcontext", "2dcontext2", "BackgroundSync", "CSS2", "InputDeviceCapabilities", "SMIL3", "SVG11", "SVGTiny12", "ServiceWorker", "accname-1.1", "aria", "aria-practices", "av1-isobmff", "books", "contentEditable", "core-aam-1.1", "css-color-3", "css-fonts-3", "css-grid-1", "css-ui-3", "css-writing-modes-3", "css2", "custom-elements", "design-principles", "dpub-aria-1.0", "dpub-latinreq", "ecma262", "editing", "feature-policy", "fingerprinting-guidance", "ilreq", "jlreq", "klreq", "ltli", "media-frags", "motion-sensors", "role-attribute", "security-privacy-questionnaire", "security-questionnaire", "selectors-3", "smil-animation", "spec", "specs", "speech-synthesis11", "typography", "wai-aria-1.1", "web-audio-perf", "webappsec", "workers", "worklets-1", "wsc-ui", "xforms", "xml", "xml-names", "xml-stylesheet", "xmlbase", "xmlschema-2", "xmlschema11-2", "xpath" ]
Results from the new run that incorporates detection of dated TR URLs, detection of outdated TRs and a more complete shortname map.
Unknown specs: [ "I-D", "aria", "av1-isobmff", "av1-rtp-spec", "av1-spec", "charmod", "charmod-norm", "contentEditable", "css-text-decoration", "dpub-aria-1.0", "ecma262", "editing", "html401", "its20", "ltli", "mathml", "media-frags", "pronunciation-lexicon", "role-attribute", "ruby", "smil", "smil-animation", "spec", "specs", "speech-synthesis11", "trace-context-protocols-registry", "wasm-core", "wcag", "webappsec", "webcomponents", "wsc-ui", "xforms11", "xhtml1", "xlink", "xlink11", "xml", "xml-id", "xml-names", "xml-names11", "xml-stylesheet", "xml11", "xmlbase", "xmlschema-2", "xmlschema11-2", "xpath-10", "xslt" ]
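For illustration, here is a minimal sketch of how dated TR URLs could be told apart from undated ones and how a shortname could be pulled out of either. It is not the code in the branch: `parseTrUrl` and `unknownShortnames` are hypothetical names, and the sketch only assumes the usual `/TR/<year>/<status>-<shortname>-<YYYYMMDD>/` layout of dated W3C URLs.

```javascript
// Hypothetical helpers, not taken from study-backrefs.js.
// Dated /TR/ URLs are assumed to look like
//   https://www.w3.org/TR/<year>/<status>-<shortname>-<YYYYMMDD>/
const DATED_TR = /^https?:\/\/www\.w3\.org\/TR\/\d{4}\/[A-Za-z]+-(.+)-(\d{8})\/?$/;
const LATEST_TR = /^https?:\/\/www\.w3\.org\/TR\/([^/]+)\/?$/;

function parseTrUrl(url) {
  // Drop any fragment or query string before matching.
  const base = url.split(/[?#]/)[0];
  let match = base.match(DATED_TR);
  if (match) {
    return { shortname: match[1], dated: true, date: match[2] };
  }
  match = base.match(LATEST_TR);
  if (match) {
    return { shortname: match[1], dated: false, date: null };
  }
  return null; // not a /TR/ URL this sketch recognizes
}

// List the shortnames found in links that the (hypothetical) set of
// known shortnames does not contain, sorted for stable output.
function unknownShortnames(urls, knownShortnames) {
  const unknown = new Set();
  for (const url of urls) {
    const parsed = parseTrUrl(url);
    if (parsed && !knownShortnames.has(parsed.shortname)) {
      unknown.add(parsed.shortname);
    }
  }
  return [...unknown].sort();
}
```

A link flagged with `dated: true` would be a candidate for the "dated TR URL" report, while anything returned by `unknownShortnames` would end up in a list like the one above.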
The study now detects broken links; the report below focuses on those only, and there are quite a few. It's starting to feel like something that spec editing tools themselves could catch early using our collected data (though probably not trivially, given the need to map URLs to specs).
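To make the idea concrete, a rough sketch of the broken-link check follows. The data shapes are assumptions made up for this example (a map of source spec to the URLs it links to, and a map of target spec URL to the set of ids that exist there); the actual study code may organize things differently.

```javascript
// Hypothetical data shapes, chosen for illustration only:
//   links:   { [sourceSpecUrl]: string[] }  absolute URLs found in each spec
//   anchors: { [targetSpecUrl]: Set<string> }  ids known to exist in each spec
function findBrokenLinks(links, anchors) {
  const broken = [];
  for (const [source, hrefs] of Object.entries(links)) {
    for (const href of hrefs) {
      const url = new URL(href);
      const fragment = decodeURIComponent(url.hash.slice(1));
      if (!fragment) continue; // nothing to check without a fragment
      url.hash = ""; // url.href is now the target spec URL without the fragment
      const ids = anchors[url.href];
      if (!ids) continue; // unknown target spec: reported elsewhere, not here
      if (!ids.has(fragment)) {
        broken.push({ source, href });
      }
    }
  }
  return broken;
}
```

The hard part mentioned above, mapping URLs to specs, is hidden in the `anchors` lookup: this sketch only works when the link uses exactly the URL under which the target spec was crawled.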
A backrefs analyzer tool has now been added. It creates the following report. The analysis could be improved further; see the "TODO" comments in the code. On top of that, it will be useful to consider mechanisms to generate these reports regularly (the tool is not included in the overall anomaly report), and to raise issues automatically in the relevant repos.
We already extract links that appear in specs (although we don't keep fragments for now). Goal is to keep fragments and compare with definitions to analyse:
data-export
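As a sketch of what "keeping fragments" could look like on the extraction side, the function below groups the links found in a spec by target page and records the fragments used for each. The name `groupByTarget` and the returned `Map` shape are assumptions for illustration, not the format reffy actually produces.

```javascript
// Hypothetical helper: group absolute links by target page and keep
// the fragments that were used for each target.
function groupByTarget(hrefs) {
  const targets = new Map();
  for (const href of hrefs) {
    const url = new URL(href);
    const fragment = url.hash.slice(1);
    url.hash = ""; // url.href is now the page URL without the fragment
    if (!targets.has(url.href)) {
      targets.set(url.href, new Set());
    }
    if (fragment) {
      targets.get(url.href).add(fragment);
    }
  }
  return targets;
}
```

With the fragments kept per target, each one can then be compared with the definitions extracted from the target spec.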