Open zcorpan opened 3 years ago
this is effectively a dupe of https://github.com/html5lib/html5lib-tests/issues/127 fwiw
@gsnedders oh, right, I had forgotten about that! It seems like there isn't objection. Are you still planning to work on this?
It is a long way down my list.
A tweak we can make is to depend on html5lib-tests instead of html5lib-python from wpt, which would remove the second step. (I think this was @jgraham's idea, but I don't see it mentioned on GitHub.)
One obvious (easy) tweak, given it's using git submodules, is to explicitly store a commit hash somewhere in WPT and then during update run: cd html5lib-python/html5lib/tests/testdata && git fetch origin && git checkout $REV
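A minimal sketch of that pinning idea, written in Python since wpt's tooling is Python. The pin file path `html/tools/html5lib-tests.rev` and all function names are hypothetical; only the submodule path and the two git commands come from the comment above.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical location of the pinned revision inside wpt; not an existing file.
PIN_FILE = Path("html/tools/html5lib-tests.rev")
# Submodule path as given in the comment above.
TESTDATA_DIR = Path("html5lib-python/html5lib/tests/testdata")


def read_pinned_rev(text: str) -> str:
    """Return the commit hash from the pin file's contents, checking its shape."""
    rev = text.strip()
    if not re.fullmatch(r"[0-9a-f]{7,40}", rev):
        raise ValueError(f"not a git commit hash: {rev!r}")
    return rev


def update_commands(rev: str) -> list[list[str]]:
    """Build the equivalent of `git fetch origin && git checkout $REV`."""
    return [
        ["git", "fetch", "origin"],
        ["git", "checkout", rev],
    ]


def update_testdata(rev: str, testdata_dir: Path = TESTDATA_DIR) -> None:
    """Run the commands inside the submodule, stopping on the first failure."""
    for cmd in update_commands(rev):
        subprocess.run(cmd, cwd=testdata_dir, check=True)
```

An update would then be `update_testdata(read_pinned_rev(PIN_FILE.read_text()))`, and bumping the tests becomes a one-line change to the pin file that shows up in review.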
My main concern is that I want to preserve the file format as the preferred form for making modifications to the tests, since there are non-WPT consumers of those formats.
I'm not a fan of WPT having a build step that transforms the tree builder test format. FWIW, Gecko's mochitest harness stores the original .dat format in the repo and parses it when the tests are run.
Having the source files in the same format in wpt and parsing them with JS when running sounds ideal actually. Can that parser be migrated to wpt?
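For a sense of how small a run-time parser for the .dat format can be, here is a rough Python sketch (an in-browser version would be JS). It is not Gecko's or html5lib's parser: `parse_dat` and the sample data are illustrative, and it assumes a simplified reading of the format in which every line starting with `#` is a section heading and `#data` starts a new test.

```python
def parse_dat(text: str) -> list[dict[str, str]]:
    """Split tree-construction .dat text into tests keyed by section name.

    Simplifications (assumptions of this sketch): any line starting with
    "#" is treated as a section heading, and trailing newlines in a
    section are dropped.
    """
    tests: list[dict[str, list[str]]] = []
    current = None
    section = None
    for line in text.split("\n"):
        if line.startswith("#"):
            heading = line[1:].strip()
            if heading == "data":
                # "#data" opens a new test.
                current = {}
                tests.append(current)
            if current is None:
                raise ValueError(f"section {heading!r} before first #data")
            section = heading
            current[section] = []
        elif current is not None:
            current[section].append(line)
    # Join each section and strip the blank separator line between tests.
    return [
        {name: "\n".join(lines).rstrip("\n") for name, lines in test.items()}
        for test in tests
    ]


# Illustrative sample, not copied from the real test suite.
SAMPLE = """#data
<p>One
#errors
(1,3): expected-doctype-but-got-start-tag
#document
| <html>
|   <head>
|   <body>
|     <p>
|       "One"
"""

for test in parse_dat(SAMPLE):
    print(test["data"], "->", len(test["document"].splitlines()), "tree lines")
```

A real harness would also need to handle `#document-fragment` contexts and the scripting flags, which this sketch only stores as ordinary sections.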
Having worked on a parser bug in WebKit I now think this would be even more valuable than I previously thought. It looks like Chromium and WebKit both have two sets of parser tests in the tree:
And the former has tests the latter might not contain. I contributed further to this problem in https://github.com/WebKit/WebKit/pull/12019, but am willing to be part of the cleanup crew if we make web-platform-tests the true home of HTML parser tests.
I suspect @mfreed7 might be interested in this from the Chromium side. Copying here to gather interest.
I'm definitely supportive of the effort to clean this up, and make WPT the source of truth for parser tests.
Steps taken thus far:
I wonder if @zcorpan is still interested in taking this even further as I think it would definitely be preferable if we didn't have to go via html5lib-tests.
https://github.com/html5lib/html5lib-tests does have a number of actionable issues and stale PRs worth triaging. Help appreciated.
Yes. See https://github.com/html5lib/html5lib-tests/issues/127#issuecomment-1490501826 and later comments.
@zcorpan any progress on this?
Not yet but it's on my list.
This week I've done the exercise of updating HTML parser tests again, though this time I was a bit more successful in figuring out how to get those changes through to wpt (see #2887). But boy is it painful and also mostly undocumented!
1. Change html5lib-tests (in the custom test data format). https://github.com/html5lib/html5lib-tests/pull/133
2. Update html5lib-python's submodule of html5lib-tests AND update .pytest.expect (manually?) so that html5lib itself doesn't fail the changed tests without having them marked as expected failures. https://github.com/html5lib/html5lib-python/pull/531
3. Update html5lib-python in wpt's html/tools/build.sh and generate tests in wpt by running html/tools/build.sh. https://github.com/web-platform-tests/wpt/pull/27799
Juggling 3 repos for one change like this doesn't seem ideal for contributors. From wpt's perspective, what I would like instead is:
1. Make the change in wpt and run a script to generate tests. No dependency on html5lib.
Then html5lib-python can get the tree-builder test data from wpt instead of from html5lib-tests.
Thoughts? @gsnedders @jgraham @annevk @stephenmcgruer