Closed by xmo-odoo 6 years ago
Fragments parse fine already as far as I know.
@kovidgoyal it (logically) parses the fragment as an entire document:
from html5_parser import parse
from lxml import html

fragment = b'a \n<b>foo</b><span>bar</span>\n \n'
# html5-parser builds an lxml tree by default
p = parse(fragment)
print(p)
print(html.tostring(p))
results in
<Element html at 0x1089bc888>
b'<html><head></head><body>a \n<b>foo</b><span>bar</span>\n \n</body></html>'
which is not necessarily convenient, especially when the incoming fragment might be an entire document: we've got to disambiguate between fragment-fragment (and keep only the body's content, I guess) and document-fragment. Checking whether the <head> is completely empty might do the trick, though.
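A rough sketch of that empty-<head> check, for illustration only. The helper name `looks_like_fragment` is made up here, and it assumes the tree is un-namespaced so that `find('head')` matches. It is a heuristic: a full document whose <head> happens to have no children would be misclassified as a fragment.

```python
def looks_like_fragment(root):
    """Heuristic: guess that the input was a bare fragment when the parsed
    tree's <head> has no child elements (hypothetical helper)."""
    head = root.find('head')
    # A missing <head> or one without children suggests the parser
    # synthesised the document wrapper around a fragment.
    return head is None or len(head) == 0

# Usage, assuming html5-parser is installed:
#   from html5_parser import parse
#   root = parse(b'a \n<b>foo</b><span>bar</span>\n \n')
#   if looks_like_fragment(root):
#       body = root.find('body')   # keep only the body's content
```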
When would you possibly want to parse something as a fragment or a document? Either you want fragments or you want documents; the two are incompatible. If you want your parsing to result in fragments, simply add a "<div>" to the start of the string, parse it, and return the first div from the parse tree. If you want your parsing to result in documents, you don't need to do anything.
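The "<div>" suggestion above could look roughly like the sketch below. `parse_fragment` is a hypothetical helper, not part of html5-parser. The parser is taken as an optional argument, and html5_parser's `parse` is imported only when none is given. The sketch assumes string input and an un-namespaced tree, so that `iter('div')` finds the wrapper.

```python
def parse_fragment(markup, parse=None):
    """Parse `markup` as a fragment by wrapping it in a <div> (hypothetical helper).

    `parse` is any callable that turns a string into an element tree;
    when omitted, html5_parser.parse is used.
    """
    if parse is None:
        from html5_parser import parse  # third-party: pip install html5-parser
    # Prepend the wrapper; an HTML5 parser closes the open <div> at end of input.
    root = parse('<div>' + markup)
    # The wrapper is the first <div> in document order.
    return next(root.iter('div'))
```

The fragment's text and elements then hang off the returned div (`div.text`, `list(div)`). Input that cannot legally sit inside a <div>, or that contains a stray `</div>`, may be rearranged by the parser, so this is only a starting point.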
The documentation only mentions a single method, but doesn't seem to say anything about fragments.
Is there a way to parse fragments (as non-document) with html5parser?