Add logging to indicate mismatch between HTML spec version and html dumps version #44

Open appledora opened 2 years ago

In GitLab by @geohci on Sep 20, 2022, 16:34

Our extraction logic is generally only correct for a given HTML spec version; e.g., HTML 2.5 changed how different filetypes are identified in the DOM. Most if not all things should be stable from version to version (breaking changes should be rare), but it would probably be good for our code to have a hard-coded parameter recording which HTML spec version it was built for. That parameter would be compared to the HTML spec version in the article HTML to make sure they match, and the code could emit a warning message on a mismatch so folks know there may be errors.
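A rough sketch of what such a check might look like, using only the standard library. Everything named here is hypothetical and not part of the current code: `SUPPORTED_HTML_SPEC_VERSION`, `get_spec_version`, and `check_spec_version`. It also assumes the article HTML carries its spec version in a `<meta property="mw:htmlVersion" content="...">` tag, which should be verified against the actual dumps, and it assumes that comparing only major.minor is strict enough.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger("spec_version_check")

# Hypothetical constant: the HTML spec version the extraction logic was built for.
SUPPORTED_HTML_SPEC_VERSION = "2.5.0"


class _SpecVersionFinder(HTMLParser):
    """Records the content of the first <meta property="mw:htmlVersion"> tag."""

    def __init__(self):
        super().__init__()
        self.version = None

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <meta ... />.
        if tag == "meta" and self.version is None:
            attr_map = dict(attrs)
            # Assumption: the spec version lives in this meta property.
            if attr_map.get("property") == "mw:htmlVersion":
                self.version = attr_map.get("content")


def get_spec_version(article_html):
    """Return the spec version string found in the article HTML, or None."""
    finder = _SpecVersionFinder()
    finder.feed(article_html)
    return finder.version


def check_spec_version(article_html, expected=SUPPORTED_HTML_SPEC_VERSION):
    """Return True if major.minor matches; otherwise log a warning and return False."""
    found = get_spec_version(article_html)
    if found is None:
        logger.warning(
            "No HTML spec version found in article HTML (expected %s).", expected
        )
        return False
    # Compare major.minor only, so patch-level differences do not warn.
    if found.split(".")[:2] != expected.split(".")[:2]:
        logger.warning(
            "HTML spec mismatch: code was built for %s but article HTML is %s; "
            "extraction results may contain errors.",
            expected,
            found,
        )
        return False
    return True
```

Returning a boolean in addition to logging lets callers decide whether to continue, skip the article, or count mismatches across a dump. Whether the check should run once per dump or once per article is left open.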

appledora / mwparserfromhtml
