Decided to add some convenience APIs for the following things:

Parsing

It's tedious to write out the full ::parse(Some(html.into()), &arena, Default::default()) call every time.

Also, Nokogiri uses Html/Xml to differentiate parsers. I assume a similar API would be nice, so I propose the following syntax:

    Html::from_string(html).parse(&arena);

So basically I split creating and parsing into two steps and absorbed Default into from_string. First you construct an object from a string with Html::from_string(html), then you add the arena and it parses. I think this will make it easier to change the internal representation later (e.g. if we switch the tree to something that doesn't require an arena).

Input functions

For convenience I've added from_string and from_file methods to kuchiki; the idea is that people can just pass a path to read and parse a file. Possible future improvement: integrate hyper to retrieve an HTML page or XML file and parse it.

Serialization

I provided a ToString implementation for Node<'a>, which allows any node to be serialized with a simple node.to_string() instead of writing out the serialization by hand.

Selectors

I added a css method to Node<'a> which filters all descendants of that node and returns an iterator for further processing. It's a convenience method for replacing:

    let document = ::parse(Some(html.into()), &arena, Default::default());
    let selectors = ::selectors::parser::parse_author_origin_selector_list_from_str("p.foo").unwrap();
    let matching = document.descendants()
        .filter(|node| node.is_element() && ::selectors::matching::matches(&selectors, node, &None))
        .collect::<Vec<_>>();

with the following:

    let document = Html::from_string(html).parse(&arena);
    let matching = document.css("p.foo").collect::<Vec<_>>();
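
To make the two-step construct/parse idea concrete, here is a rough, standalone sketch of the shape I have in mind. It is not the code in this PR: Arena, ParseOpts, and Document are hypothetical stand-ins (the toy parse just stores the text), and returning io::Result from from_file is an assumption of the sketch.

```rust
use std::fs;
use std::io;
use std::marker::PhantomData;
use std::path::Path;

/// Placeholder for the node arena; a real one would own the tree's nodes.
pub struct Arena;

/// Hypothetical parser options, standing in for the `Default::default()` argument.
#[derive(Default, Debug, PartialEq)]
pub struct ParseOpts {
    pub scripting_enabled: bool,
}

/// Step 1: an unparsed HTML source plus the options it will be parsed with.
pub struct Html {
    source: String,
    opts: ParseOpts,
}

/// Toy parse result; the lifetime ties it to the arena it was parsed into.
pub struct Document<'a> {
    pub source: String,
    pub opts: ParseOpts,
    _arena: PhantomData<&'a Arena>,
}

impl Html {
    /// Build from an in-memory string, absorbing the default options.
    pub fn from_string<S: Into<String>>(source: S) -> Html {
        Html { source: source.into(), opts: ParseOpts::default() }
    }

    /// Build by reading a file; I/O errors are returned to the caller.
    pub fn from_file<P: AsRef<Path>>(path: P) -> io::Result<Html> {
        Ok(Html::from_string(fs::read_to_string(path)?))
    }

    /// Step 2: parse into the given arena (the toy version only keeps the text).
    pub fn parse<'a>(self, _arena: &'a Arena) -> Document<'a> {
        Document { source: self.source, opts: self.opts, _arena: PhantomData }
    }
}
```

Because callers only ever see from_string/from_file followed by parse, the arena argument is the only place the internal tree representation leaks into user code, which is what should make it easier to swap out later.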
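
For the serialization and selector helpers, a minimal self-contained sketch of the pattern follows. Again this is an illustration rather than the PR's code: the owned Node type, its fields, and the tiny matcher (which only understands tag, .class, and tag.class) are made up for the example, whereas the real method hands the string to the selectors crate. The sketch implements Display, which yields to_string() through the standard library's blanket ToString impl.

```rust
use std::fmt;

/// Toy element node; a stand-in for the arena-allocated `Node<'a>`.
pub struct Node {
    pub tag: String,
    pub classes: Vec<String>,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(tag: &str, classes: &[&str], children: Vec<Node>) -> Node {
        Node {
            tag: tag.to_string(),
            classes: classes.iter().map(|c| c.to_string()).collect(),
            children,
        }
    }

    /// All descendants in document (pre-order) order, excluding `self`.
    pub fn descendants(&self) -> Vec<&Node> {
        let mut out = Vec::new();
        for child in &self.children {
            out.push(child);
            out.extend(child.descendants());
        }
        out
    }

    /// Toy matcher: understands only `tag`, `.class`, and `tag.class`.
    fn matches(&self, selector: &str) -> bool {
        let (tag, class) = match selector.split_once('.') {
            Some((t, c)) => (t, Some(c)),
            None => (selector, None),
        };
        (tag.is_empty() || tag == self.tag)
            && class.map_or(true, |c| self.classes.iter().any(|k| k == c))
    }

    /// Convenience wrapper: iterate over the descendants matching a selector string.
    pub fn css<'n>(&'n self, selector: &'n str) -> impl Iterator<Item = &'n Node> + 'n {
        self.descendants().into_iter().filter(move |n| n.matches(selector))
    }
}

/// Implementing `Display` gives every node `to_string()` via the blanket `ToString` impl.
impl fmt::Display for Node {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<{}", self.tag)?;
        if !self.classes.is_empty() {
            write!(f, " class=\"{}\"", self.classes.join(" "))?;
        }
        write!(f, ">")?;
        for child in &self.children {
            write!(f, "{}", child)?;
        }
        write!(f, "</{}>", self.tag)
    }
}
```

With this shape, something like doc.css("p.foo").collect::<Vec<_>>() reads the same as the proposed kuchiki call, and any matched node can be turned back into markup with to_string().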