Open doriantaylor opened 5 years ago
@doriantaylor to be honest, I have no idea. This code is fairly old; when I began its first version (must be way more than 10 years ago…), minidom was the tool for XML and, following the adage "ain't broken, don't fix it", I never really changed it. I cannot judge the difficulty.
One potential issue may be (but again it may not be…) whether the minidom interface obtained when parsing pure XML content (say, an SVG file) is clearly compatible with what is produced via the html5parser. I would be surprised if there was a difference, but this must be checked. Obviously, html5parser (which is an external dependency) plays an essential role.
I do not have any objection at all if you try. Mind you, this library is behind the RDFa distiller and parser service at W3C (which has a decent usage), so there has to be extra care in adopting any change…
It looks like `html5lib` has an option to construct output with `lxml.etree`, however my reading of `graph_from_DOM` is that it's farther down the pipeline than that. One might be able to get away with a small proxy class that does a partial implementation: `lxml.etree._Element` → `DOMNodeProxy`
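
To make the proxy idea concrete, here is a rough sketch and not a tested patch. It wraps an ElementTree-style element and exposes a handful of minidom-like members. I am assuming that `graph_from_DOM` needs roughly `nodeType`, `tagName`, `getAttribute`, `hasAttribute`, `childNodes` and `parentNode`; I have not checked that list against the code. `TextProxy` is a made-up helper name. The sketch uses the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies; `lxml.etree` elements share the same `tag`/`attrib`/`text`/`tail` model, but comments, processing instructions and namespaced names (`{uri}local`) would need extra handling.

```python
import xml.etree.ElementTree as ET
from xml.dom import Node


class TextProxy:
    """Stands in for a minidom Text node (hypothetical helper)."""
    nodeType = Node.TEXT_NODE

    def __init__(self, data, parent):
        self.data = data
        self.nodeValue = data
        self.parentNode = parent


class DOMNodeProxy:
    """Partial minidom-style view over an ElementTree-style element."""
    nodeType = Node.ELEMENT_NODE

    def __init__(self, element, parent=None):
        self._el = element
        self.parentNode = parent

    @property
    def tagName(self):
        return self._el.tag

    nodeName = tagName  # minidom exposes both names for elements

    def hasAttribute(self, name):
        return name in self._el.attrib

    def getAttribute(self, name):
        # minidom returns an empty string for a missing attribute
        return self._el.attrib.get(name, "")

    @property
    def childNodes(self):
        # ElementTree stores text as .text/.tail; DOM wants it as sibling nodes
        nodes = []
        if self._el.text:
            nodes.append(TextProxy(self._el.text, self))
        for child in self._el:
            nodes.append(DOMNodeProxy(child, self))
            if child.tail:
                nodes.append(TextProxy(child.tail, self))
        return nodes


root = ET.fromstring('<div about="#me">Hello <span property="name">Alice</span>!</div>')
node = DOMNodeProxy(root)
kids = node.childNodes
```

Note that `childNodes` builds fresh proxy objects on every access, so identity comparisons between nodes fetched at different times would fail; a real implementation would probably want to cache them.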
I will take a look at what this entails. Maybe somebody has done it already?
Hey there,
Just tried to feed `graph_from_DOM` an already-parsed `lxml.etree` document and I tripped over the fact that it only speaks `xml.dom.minidom`. Since both these APIs give access to roughly the same information (at least as far as RDFa is concerned), I'd be okay with trying to make it handle both, unless it was too much of a snarl, or you didn't want it to for some reason.

Thoughts?
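
For context, the stopgap that works without touching the library is to serialize the already-parsed tree and parse it again with minidom, which costs a second parse. A minimal sketch, using the standard library's `xml.etree.ElementTree` as a stand-in so it runs anywhere (with lxml the equivalent call would be `lxml.etree.tostring`); `etree_to_minidom` is just an illustrative name:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def etree_to_minidom(root):
    """Serialize an ElementTree-style element and re-parse it with minidom."""
    return minidom.parseString(ET.tostring(root, encoding="utf-8"))


root = ET.fromstring('<div about="#me"><span property="name">Alice</span></div>')
dom = etree_to_minidom(root)
span = dom.documentElement.getElementsByTagName("span")[0]
```

The resulting `dom` is an ordinary minidom `Document`, so it is what `graph_from_DOM` speaks today.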