expath / xpath-ng

Wishlist for XPath Syntax Extensions
Creative Commons Attribution 4.0 International

Propose several CSS Selector style shorthands (first-child, last-child, etc.) #19

Open rhdunn opened 4 years ago

rhdunn commented 4 years ago

I've found myself wanting to write XPath expressions equivalent or similar to the CSS E:first-child selector. The two use cases I have so far are:

  1. first child element is a given element -- child::*[1]/self::element-name;
  2. first child node is a text node -- child::node()[1]/self::text().

These are harder to read in XPath as they are spanning multiple steps, so having shorthand selectors for these could be useful.
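As a rough illustration of how the two use cases differ, here is a minimal Python sketch using the standard library's xml.dom.minidom. The helper names first_child_element_named and first_child_text are hypothetical and only approximate the XPath semantics (for example, minidom reports CDATA sections as a separate node type, which text() would also match).

```python
from xml.dom import minidom, Node


def first_child_element_named(parent, name):
    """Approximates child::*[1]/self::name (hypothetical helper)."""
    for child in parent.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            # This is the first element child; keep it only if the name matches.
            return child if child.tagName == name else None
    return None


def first_child_text(parent):
    """Approximates child::node()[1]/self::text() (hypothetical helper)."""
    first = parent.firstChild
    if first is not None and first.nodeType == Node.TEXT_NODE:
        return first
    return None


doc = minidom.parseString("<p>Hello <b>bold</b><i>it</i></p>")
p = doc.documentElement

first_b = first_child_element_named(p, "b")   # the <b> element
first_i = first_child_element_named(p, "i")   # None: <i> is not the first element child
leading_text = first_child_text(p)            # the text node before <b>
```

The first helper skips non-element nodes before taking position 1, while the second takes position 1 among all child nodes. That difference is what the two XPath expressions encode.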

NOTE: I don't have any concrete proposals for this.

Using first-child::element-name and first-child::text() would be easier to read. The former is CSS-like (match elements only), while the latter is XPath-like (match any nodes). Using different context behaviour would be confusing, and inconsistent with the other XPath axis selectors. It would also prevent supporting a child::node()[1]/self::element-name equivalent.

Another possibility would be to create two separate axes, such as first-child and first-child-element. This would be consistent with axes like ancestor/ancestor-or-self, but could quickly increase the number of axes. This version would add at least 4 new axes:

  1. first-child:: (forward);
  2. first-child-element:: (forward);
  3. last-child:: (reverse);
  4. last-child-element:: (reverse).

NOTE: Only the tree traversal CSS selectors would be applicable. CSS selectors like E:hover require context information that is not available, nor relevant to XPath.

Reference: https://drafts.csswg.org/selectors-4/#overview

michaelhkay commented 4 years ago

Orthogonality is always a good design principle. If you want the first or last node on the child axis, then you might equally want the first or last node on any other axis, e.g. following-sibling or ancestor. (Of course we have parent as a special case for ancestor::*[1]; but let's not use that as a precedent).

If we don't like child::*[1]/self::name (and we don't), then we could allow

child::[1]name

or

child::[1]text()

or more generally

AxisStep ::= (ReverseAxis | ForwardAxis) PredicateList NodeTest

to indicate that the predicate is applied before the node-test, without any loss of orthogonality.

It looks unambiguous to me: "::" followed by "[" can't mean anything else.

Michael Kay

On 6 Apr 2020, at 17:51, Reece H. Dunn notifications@github.com wrote:

rhdunn commented 4 years ago

I agree w.r.t. orthogonality.

Your proposed syntax (which I like) would work differently for the first example. That example matches the first element child (not the first node of any kind) if it has the specified name, so it ignores text and comment nodes.

That is, it would sometimes be useful to apply axes on elements only instead of all children.
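A minimal Python sketch of that difference, assuming the proposed child::[1]name picks position 1 among all child nodes. The function first_then_name and its elements_only flag are hypothetical names used only for illustration.

```python
from xml.dom import minidom, Node


def first_then_name(parent, name, elements_only):
    """Take the first candidate on the child axis, then apply the name test.

    elements_only=True  approximates child::*[1]/self::name
    elements_only=False approximates the assumed reading of child::[1]name
    """
    candidates = [
        n for n in parent.childNodes
        if not elements_only or n.nodeType == Node.ELEMENT_NODE
    ]
    first = candidates[:1]
    return [
        n for n in first
        if n.nodeType == Node.ELEMENT_NODE and n.tagName == name
    ]


doc = minidom.parseString("<p>intro <span>x</span></p>")
p = doc.documentElement

over_all_nodes = first_then_name(p, "span", elements_only=False)  # empty: first node is text
over_elements = first_then_name(p, "span", elements_only=True)    # the <span> element
```

With mixed content, the leading text node occupies position 1 on the unfiltered axis, so the two readings give different results for the same name.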

liamquin commented 4 years ago

I often see people leave out the * before a predicate, /a/b/[last()] or whatever (meaning the last "b" element, or perhaps meaning the last child of b). So making it legal would turn a syntax error into a legal but potentially wrong expression.

Is *[1][self::span] so hard? Since child is the default we can already omit that part.