atifaziz / Fizzler

.NET CSS Selector Engine

Unknown pseudo-class root #68

Open sbrl opened 5 years ago

sbrl commented 5 years ago

Hey!

I've got a particular use-case whereby I sometimes need to select against the :root element, but Fizzler doesn't appear to support it:

System.AggregateException: One or more errors occurred. (Unknown pseudo-class 'root'. Use either first-child, last-child, only-child or empty.) ---> System.FormatException: Unknown pseudo-class 'root'. Use either first-child, last-child, only-child or empty.
  at Fizzler.Parser.PseudoClass () [0x000a2] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Pseudo () [0x00000] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.SimpleSelectorSequence () [0x000bf] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Selector () [0x0000b] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.SelectorGroup () [0x00000] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Parse () [0x0000b] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Parse[TGenerator,T] (System.Collections.Generic.IEnumerable`1[T] tokens, TGenerator generator, System.Func`2[T,TResult] resultor) [0x00032] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Parse[TGenerator,T] (System.String selectors, TGenerator generator, System.Func`2[T,TResult] resultor) [0x00028] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Parser.Parse[TGenerator] (System.String selectors, TGenerator generator) [0x00000] in <a9e51db5ce7a4a299b29e9b06a6b709c>:0
  at Fizzler.Systems.HtmlAgilityPack.HtmlNodeSelection.Compile (System.String selector) [0x0001f] in <1deeb267dd9748e08600ff9868bd5266>:0
  at Fizzler.Systems.HtmlAgilityPack.HtmlNodeSelection.QuerySelectorAll (HtmlAgilityPack.HtmlNode node, System.String selector, System.Func`2[T,TResult] compiler) [0x00021] in <1deeb267dd9748e08600ff9868bd5266>:0
  at Fizzler.Systems.HtmlAgilityPack.HtmlNodeSelection.QuerySelectorAll (HtmlAgilityPack.HtmlNode node, System.String selector) [0x00000] in <1deeb267dd9748e08600ff9868bd5266>:0
  at Fizzler.Systems.HtmlAgilityPack.HtmlNodeSelection.QuerySelector (HtmlAgilityPack.HtmlNode node, System.String selector) [0x00000] in <1deeb267dd9748e08600ff9868bd5266>:0
.....

atifaziz commented 5 years ago

I'm afraid it's not supported at the moment.

I am guessing your use case is such that your root element is not <html> and so you cannot select it with the type selector html?

sbrl commented 5 years ago

Ah, I suspected as much.

Something like that, yeah.

I'm actually writing an Atom feed generator (PolyFeed!), and in the configuration file I have an option to select the HTML elements that should form the basis of the items in the Atom feed.

For each element there are then a number of other selectors that match the elements containing various pieces of information (e.g. the author's name), and for each of these I have a separate option to specify which attribute on the selected element to extract the information from. I want to be able to hit the 'root' element here, as I'm making these secondary matches with subNode.QuerySelector("selector here").

In theory, I could do this by testing if (selector == ":root") return htmlNode; else return htmlNode.QuerySelector(selector);, but it'd be great if I didn't have to wrap it :P
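
For reference, the wrapper described above could be sketched roughly as follows. This is only a minimal sketch, assuming HtmlAgilityPack's HtmlNode and the QuerySelector extension method from Fizzler.Systems.HtmlAgilityPack (both appear in the stack trace above); the class name RootAwareSelection and the method name QuerySelectorOrSelf are made up for illustration and are not part of Fizzler.

```csharp
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

// Hypothetical helper, not part of Fizzler: it treats ":root" as
// "the node the query was started from" instead of the document's root element.
public static class RootAwareSelection
{
    public static HtmlNode QuerySelectorOrSelf(this HtmlNode node, string selector)
    {
        // Short-circuit before the selector ever reaches Fizzler's parser,
        // which would otherwise throw "Unknown pseudo-class 'root'".
        if (selector.Trim() == ":root")
            return node;

        // Everything else is passed through to Fizzler unchanged.
        return node.QuerySelector(selector);
    }
}
```

With something like this in place, subNode.QuerySelectorOrSelf(":root") would hand back subNode itself, while any other selector behaves exactly as subNode.QuerySelector(selector) does today. Note that this gives ":root" an application-specific meaning that differs from the CSS one.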

atifaziz commented 5 years ago

Thanks for the detailed explanation, @sbrl, and I think I see where you are going with this.

While adding support for :root is not particularly difficult, it is designed to work only in a very specific scenario, and I am not sure it will help in your case. You see, a *:root query only makes sense when it is started from the document node, which sits hierarchically above the root element. In other words, you can't get to the root element from a query on a child element, which is what I think you are trying to do. So piggy-backing on :root in the way you want (that is, if (selector == ":root") return htmlNode; else return htmlNode.QuerySelector(selector);) would have to be an enhancement specific to your application anyhow.

sbrl commented 5 years ago

Ah, I see. Looks like it doesn't work in the way I thought it would. I didn't realise that :root is relative to the top-level document node and not the child element from which you're running the query.

I'll leave this issue open as a feature request anyway though.