Open frankier opened 9 years ago
+1 to use html5lib
I've found lxml2's html parser to be unable to handle any real-world HTML. However, I found html5lib has a habit of closing parent tags off early, causing the children to become siblings. I personally found the inbuilt Python parser superior to html5lib.
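For anyone who wants to try the inbuilt parser on their own markup, here is a minimal sketch using the standard library's `html.parser`. The `TagCollector` class, the `collect_tags` helper and the sample markup are made up for illustration and are not part of Splinter. Note that `html.parser` is event-based: it reports tags as it meets them and does not build a DOM, so any tree structure is up to the caller.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag; tag names are not checked
        # against a fixed element list, so HTML5 elements pass through.
        self.tags.append(tag)


def collect_tags(markup):
    """Feed markup to the built-in parser and return the start tags seen."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Illustrative input: HTML5 elements plus an unclosed <p>.
sample = (
    "<!DOCTYPE html><html><body>"
    "<article><p>Hi</article><footer>bye</footer>"
    "</body></html>"
)
print(collect_tags(sample))
```

The sample deliberately includes `<article>`, `<footer>` and an unclosed `<p>`; the parser reports all of the start tags without raising.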
It looks like libxml2's html parsing doesn't produce a proper html5 DOM and sometimes chokes on valid html5, even when run in tolerant mode, which can result in errors like "XMLSyntaxError: ... Tag footer invalid". The solution is probably to allow the usage of html5lib instead. One hitch with this is that the methods from HTMLMixin would no longer be available, so Splinter's dependence on them should be removed.