Skarsnik closed this 7 years ago
This handles tag errors just fine. It sounds like you want the supercedes directive in perl6. If you open this on HTML::Scraper, then I'll make it interchangeable as long as Gunbound or whatever conforms to the same interface.
Don't get me wrong. I don't want to replace h:p:x with Gumbo (since it requires an external lib). I just want people and module writers who need to parse HTML to have a common place to look at. The implementation could then be selected if a user has a specific need (in my case, faster parsing).
I am not sure I understand how supercedes would work for what I want.
@Skarsnik supercedes is NYI, but you'd essentially do class Gumbo supercedes HTML::Parser::XML, and then wherever there is use HTML::Parser::XML and they have Gumbo installed, Gumbo would be used in H:P:X's place. It's NYI. I'll write a role for HTML -> XML parsing and if you open this issue on Web::Scraper, I'll modify Web::Scraper to use that instead (this is the right way to make this work).
I don't really know how to formulate this, but I wrote a Gumbo binding (https://github.com/Skarsnik/perl6-gumbo). Gumbo is a robust HTML5 parsing lib that handles tag errors as defined in the spec.
I realise the two modules provide the same thing: you give them an HTML string and they give back an XML::Document.
So maybe we can create a common module like Service::Parse::HTML that uses h:p:x by default (since it's pure Perl 6 code), but if the user writes use Gumbo; beforehand, Gumbo can tell the module to use its own implementation (or another module's).
I was thinking of this because I wanted a way to trick modules like HTML::Restrict or HTML::Scrapper that use h:p:x into using Gumbo instead, without touching their code.
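To make the common-module idea concrete, here is a minimal sketch of how a shared entry point with a swappable backend might look. Every name in it (HTMLParser, DefaultBackend, FastBackend, ParserRegistry, use-backend) is hypothetical and exists in neither HTML::Parser::XML nor Gumbo. The backends return plain strings as stand-ins for an XML::Document so the example has no external dependencies.

```perl6
# Hypothetical sketch only: these names are assumptions, not an existing API.

# Common interface: every backend turns an HTML string into a document.
role HTMLParser {
    method parse(Str $html) { ... }   # stubbed, so composing classes must implement it
}

# Stand-in for the pure Perl 6 default backend (h:p:x in the discussion).
class DefaultBackend does HTMLParser {
    method parse(Str $html) { "default backend parsed: $html" }
}

# Stand-in for a faster native backend (Gumbo in the discussion).
class FastBackend does HTMLParser {
    method parse(Str $html) { "fast backend parsed: $html" }
}

# The "common place to look at": callers only ever talk to this class.
class ParserRegistry {
    my HTMLParser $current = DefaultBackend.new;

    # A backend module could call this when it is loaded.
    method use-backend(HTMLParser $backend) { $current = $backend }

    method parse(Str $html) { $current.parse($html) }
}

say ParserRegistry.parse('<p>hi</p>');        # handled by DefaultBackend
ParserRegistry.use-backend(FastBackend.new);
say ParserRegistry.parse('<p>hi</p>');        # now handled by FastBackend
```

With this shape, a module such as HTML::Restrict would call only the registry and would not need to change when a user picks a different backend. How a backend registers itself on use (for example from its own module body) is left open here.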