kuchiki-rs / kuchiki

(朽木) HTML/XML tree manipulation library for Rust
MIT License

parser: Allow callers to create their own sinks #87

Closed kkuehlz closed 3 years ago

kkuehlz commented 3 years ago

This allows users to call parse_document with their own sink as follows:

let sink = Sink::default();
let parser = html5ever::parse_document(sink, ParseOpts::default())
...

kkuehlz commented 3 years ago

I can't merge this (permissions). If it looks good to you, feel free to send it.

Ygg01 commented 3 years ago

Looks OK, but I'm a bit wary of exposing this to the public. I'd like it better if there were a few examples of usage.

kkuehlz commented 3 years ago

It's mostly just API parity with RcDom. I'm moving a project from RcDom to your crate, and with RcDom you can create the sink directly. That allowed us to extend the DOM and do some work during the parse pass, then call the underlying method. Here is a code example if that helps. Your call though. I know this library makes no guarantees about API compatibility with RcDom.

I'm happy to construct a small example to go in this lib if that's what you are looking for?
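
For readers following along, the wrapper pattern described above (do some work during the parse pass, then call the underlying method) can be sketched roughly as follows. This is only an illustration under stated assumptions: `MiniSink`, `InnerSink`, `CountingSink`, and `parse_with` are hypothetical names invented for this sketch, and `MiniSink` is a drastically simplified stand-in, not html5ever's actual `TreeSink` trait or kuchiki's `Sink`.

```rust
// Hypothetical, heavily simplified stand-in for a tree-builder sink.
// This is NOT html5ever's `TreeSink` trait, which has many more methods.
trait MiniSink {
    type Output;
    fn append_element(&mut self, name: &str);
    fn finish(self) -> Self::Output;
}

// Stand-in for the library's own sink: records element names in order.
#[derive(Default)]
struct InnerSink {
    elements: Vec<String>,
}

impl MiniSink for InnerSink {
    type Output = Vec<String>;

    fn append_element(&mut self, name: &str) {
        self.elements.push(name.to_string());
    }

    fn finish(self) -> Vec<String> {
        self.elements
    }
}

// Wrapper sink: does its own bookkeeping during the parse pass,
// then forwards every call to the wrapped sink.
struct CountingSink<S> {
    inner: S,
    links_seen: usize,
}

impl<S: MiniSink> MiniSink for CountingSink<S> {
    type Output = (S::Output, usize);

    fn append_element(&mut self, name: &str) {
        // Extra work done while parsing: count `a` elements.
        if name == "a" {
            self.links_seen += 1;
        }
        // Delegate to the underlying method.
        self.inner.append_element(name);
    }

    fn finish(self) -> Self::Output {
        (self.inner.finish(), self.links_seen)
    }
}

// Stand-in for a parser entry point that accepts any caller-supplied sink.
fn parse_with<S: MiniSink>(mut sink: S, element_names: &[&str]) -> S::Output {
    for &name in element_names {
        sink.append_element(name);
    }
    sink.finish()
}
```

The wrapper only compiles if the inner sink type can be named and constructed by the caller, which is the access this PR is asking for.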

SimonSapin commented 3 years ago

Not exposing html5ever in the public API of kuchiki was intentional, in order to minimize the API surface covered by SemVer.

Do you have a custom Sink that wraps kuchiki’s? My first reaction is that at this level of desired customization you might also some day want to have your own tree data structure, since representing cyclic data in Rust is so full of trade-offs. There’s not a lot of logic in kuchiki itself (as opposed to in its dependencies).

kkuehlz commented 3 years ago

Yeah, we have a custom sink wrapper. It's linked in my previous comment. Anyway, if this was a deliberate design decision, I don’t think you should change your API for my use case. We can just close this out.