metamath / metamath-knife

Metamath-knife can rapidly verify Metamath proofs, providing strong confidence that the proofs are correct.
Apache License 2.0
26 stars 11 forks source link

HTML validation in verify markup #157

Closed digama0 closed 4 months ago

digama0 commented 5 months ago

This uses the html5ever crate to parse HTML in <HTML> blocks and report any errors. Unfortunately, the error messages are kind of garbage (no positional information), so the best we can do is point at the containing block. To make the results more actionable, I manually copied the text from offending theorems to the validator, which has better error messages.

Since it's a bit of a chunky dependency, it has been added as an optional dependency (it's possible to compile without it in which case you don't get the verify_markup function).

tirix commented 5 months ago

I've relaunched the tests but there still seems to be one error with the HTML marking of the "conventions" page.

tirix commented 4 months ago

With metamath/set.mm#3932 and metamath/set.mm#3948, the CI now passes, let's merge.

jkingdon commented 4 months ago

The change to validate HTML is a good one.

But do our processes (manual and/or automated) only check set.mm (rather than iset.mm)? I had to make a fix to the one of the iset.mm HTML blocks to get a passing build: https://github.com/metamath/set.mm/pull/4031 specifically the https://github.com/metamath/set.mm/pull/4031/commits/8136a89334cbf925473a1ffad2c8856a14c20109 commit.

tirix commented 4 months ago

Sorry I should have sent notification when merging this, I thought metamath/set.mm/pull/3932 and metamath/set.mm/pull/3948 would cover the whole set.mm CI.

But do our processes (manual and/or automated) only check set.mm (rather than iset.mm)?

You can see in the verifiers.yml file that metamath-knife is used to check both set.mm and iset.mm, but no other database. I don't see any check in the CI for the other databases (like hol.mm).

I think what happened is that when testing out this new HTML validation, Mario only checked set.mm, saw the many errors, proceeded to fix them, but did not cover iset.mm at that time.

jkingdon commented 4 months ago

But do our processes (manual and/or automated) only check set.mm (rather than iset.mm)?

You can see in the verifiers.yml file that metamath-knife is used to check both set.mm and iset.mm

Yeah, this is how I found the problem in iset.mm. But it couldn't be detected before merging the metamath-knife change, because it only runs on pull requests in the set.mm repository, not pull requests in the metamath-knife repository.

I think what happened is that when testing out this new HTML validation, Mario only checked set.mm, saw the many errors, proceeded to fix them, but did not cover iset.mm at that time.

Yup, that sounds like what happened. If this only happens occasionally I don't have a problem with just handling it in a bit manual and/or ad hoc way, we we did here.

It is only if we seems to have a more frequent issue (like we used to have for set.mm rewrapping, for example) that I get more motivated to think in terms of what could have prevented it.