Closed funkyfuture closed 3 years ago
@JKatzwinkel that covers your cases, doesn't? is the method's docstring sufficiently comprehendible?
Looks really great, method docstring makes sense, I will give it a try later, thanks!
thanks for the additional tests.
my more "correct" (less hackish, really) implementation of the predicates now only fails with expressions like ./cit[@type="translation"][@lang="en"]
that are first transformed to ./entry/sense/cit[(@type="translation") and (@lang="en")]
, but bracket parsing isn't implemented. currently i'd say that this doesn't need to be part of a future, delb-native implementation of a subset of the XPath-grammar. hence that initial normalization could yield a simpler expression. but then, there's the risk that someone really throws in something like cit[@foo or @bar][@peng="pow"]
. and that then incorrectly transformed expression trickles through to lxml's xpath evaluator.
i already tend to a Verbot of overly complicated XPath expressions for now. and maybe one should aim for a first spec of a supported subset of XPath in the 0.4
release? @03b8, @JKatzwinkel, @zed-g, do you have any thoughts on that?
while elaborating the issues with @JKatzwinkel we concluded:
0.3
release(EXPR) and (EXPR)…
, its handling is simple enough to keep it supportedoh, and btw, here's an attempt to implement XPath with a generic parser: https://github.com/emory-libraries/eulxml/tree/master/eulxml/xpath
it certainly could use more tests, but parenthesized expressions should now be parsed properly.
but yes, functions now pose an unsolved problem. me currently thinks that only functions where an attribute makes sense as first argument and that return a boolean are simple enough to be supported. that are starts-with
and contains
. and not
makes sense.
While adding arbitrary assertions to test_quotes_in_css_selector
, I realized that any specification of an xpath subset we end up supporting would require us to also specify what parts of css we do and don't support... For instance, at this point, even after "implementation" of boolean xpath functions "support", we do not support the css selector [attribute$=value]
, because the css library in use doesn't convert $=
into xpath's ends-with()
function, but rather into an expression involving substring()
, the return type of which isn't boolean.
there's no 'native' ends-with
function in the XPath 1.0 specs, but it should be specified b/c of consistency w/ starts-with
.
that won't help with the css selectors though. maybe we'd just vendorize an adapted version of the transforming lib. and then there's the long term question how it would be done in Rust. my guess is that seperate parsers and evaluators for CSS and XPath are the simplest thing to do.
thanks a lot for the valuable input. i wanted to put an end to this PR before it grew endlessly.
It's surprisingly simple to implement an XPath parser. :grimacing: