Support XPath selectors

devseppala commented 9 months ago

I recently learned about htmx, as it has been getting a lot of attention on the web. I immediately liked the concept, as it feels like a move in a saner direction for web development. However, when I learned that htmx does not support XPath, I was a little disappointed. XPath support was originally included in browsers with ajax-based content swaps in mind, which is just what htmx does, and it is a pity that htmx supports only CSS selectors.

After a little thinking, I decided that implementing support for XPath selectors would be a nice way to learn htmx. So I forked htmx to my own repository https://github.com/devseppala/htmxpath, started hacking, and after a while got XPath selectors working. The following is a description of my XPath implementation, so that it can be considered for possible inclusion in htmx.

With the "!xpath:" prefix in front of the selector, the implementation allows XPath to be used in the hx-select, hx-target and hx-swap-oob attributes. For example, hx-select="!xpath:/html/body/h1[2]" selects the second h1 heading under the body element. I ran the test suite locally (npm run test) and 674 tests passed (1 pending?). The passing tests include some new XPath-specific cases.
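As a rough illustration of the prefix convention, here is a minimal sketch of how an attribute value could be split into a selector kind and an expression. The name `splitSelector` and the shape of the returned object are hypothetical and are not part of htmx or of the patch; only the "!xpath:" prefix comes from the implementation.

```javascript
// Hypothetical helper (not htmx API): classify a selector string by its prefix.
const XPATH_PREFIX = "!xpath:";

function splitSelector(value) {
  // Only strings can carry the prefix; everything else is treated as CSS.
  if (typeof value === "string" && value.startsWith(XPATH_PREFIX)) {
    return { kind: "xpath", expression: value.slice(XPATH_PREFIX.length) };
  }
  return { kind: "css", expression: value };
}

console.log(splitSelector("!xpath:/html/body/h1[2]"));
console.log(splitSelector("#main > h1"));
```

The first call yields the XPath expression /html/body/h1[2] with the prefix removed, while the second value is passed through unchanged as a CSS selector.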

Implementation details:

Selected the !xpath: prefix to distinguish XPath selectors from CSS selectors
- I wanted a prefix that could never be the start of a valid CSS selector.
Created a couple of functions for dealing with XPath queries.
- isXPathSelector(selector)
- getXPathSelector(selector)
- xpathResult(eltOrSelector, xpathSelector)
- xpathSingle(eltOrSelector, xpathSelector)
- xpathArray(eltOrSelector, xpathSelector)
  - xpathArray function converts XPathResult to Node array
  - The standard way to search for elements using CSS selectors is querySelectorAll(), which returns a NodeList object. An Array supports the same length and index access as a NodeList, so it can be used interchangeably wherever the result is only iterated over.
Modified the find() and findAll() functions to recognize XPath selectors with the !xpath: prefix
Changed a couple of direct querySelectorAll() usages to go through the find/findAll functions instead, so that XPath selectors are processed properly.
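The interchangeability of Array and NodeList relies on callers needing only `length` and index access. A minimal sketch of that assumption follows; `eachNode` is a hypothetical helper written for illustration, and htmx's own iteration helper may differ.

```javascript
// Hypothetical helper: iterate any array-like value (Array or NodeList)
// using only `length` and index access, which both types provide.
function eachNode(arrayLike, callback) {
  if (!arrayLike) return;
  for (let i = 0; i < arrayLike.length; i++) {
    callback(arrayLike[i], i);
  }
}

// A plain Array, like the one xpathArray() builds in the patch.
const fromXPath = ["h1-a", "h1-b"];
// A stand-in for a NodeList: indexed entries plus length, no Array methods.
const nodeListLike = { 0: "h1-a", 1: "h1-b", length: 2 };

const seen = [];
eachNode(fromXPath, (node) => seen.push(node));
eachNode(nodeListLike, (node) => seen.push(node));
console.log(seen);
```

Code that calls NodeList-only methods such as item() would not work with a plain Array, so the substitution is safe only where results are consumed by index like this.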

Some issues for consideration:

Is the !xpath: prefix the best choice for identifying XPath expressions?
The implementation is not IE11 compliant, but this should not be a problem for htmx 2.0
Is XPathResult.UNORDERED_NODE_ITERATOR_TYPE the best choice of XPathResult type, or should it be an ORDERED or SNAPSHOT type?
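For comparison on the last question, here is a minimal sketch of what a snapshot-based variant could look like. An iterator result becomes invalid if the document is modified before iteration finishes, whereas a snapshot result is not affected by later changes. The function name `xpathSnapshotArray` and the explicit `doc` parameter are assumptions made for illustration; the constant 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, hard-coded so the sketch
// can also be loaded outside a browser.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical variant of xpathArray() that collects a snapshot result
// into a plain Array, in document order.
function xpathSnapshotArray(doc, contextNode, expression) {
  const result = doc.evaluate(expression, contextNode, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

In a browser this could be called as xpathSnapshotArray(document, document, "//h1"). Whether the ordering guarantee is worth any extra cost compared to the unordered iterator is still an open question.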

I would like to add that I am not really a web developer, nor have I built anything using htmx. My background is more in developing a Java, XML and XSLT based documentation system, where I have become accustomed to using XPath. So feel free to question my coding choices and to suggest improvements.

Below is a diff that highlights the changes I have made to htmx.js 1.9.9. As you can see, this implementation does not add much new code to the library. I could probably even trim away a few lines if needed.

@@ -491,7 +491,8 @@ return (function () {

         function find(eltOrSelector, selector) {
             if (selector) {
-                return eltOrSelector.querySelector(selector);
+                var xpathSelector = getXPathSelector(selector);
+                return (xpathSelector ? xpathSingle(eltOrSelector, xpathSelector) : eltOrSelector.querySelector(selector));
             } else {
                 return find(getDocument(), eltOrSelector);
             }
@@ -499,7 +500,8 @@ return (function () {

         function findAll(eltOrSelector, selector) {
             if (selector) {
-                return eltOrSelector.querySelectorAll(selector);
+                var xpathSelector = getXPathSelector(selector);
+                return (xpathSelector ? xpathArray(eltOrSelector, xpathSelector) : eltOrSelector.querySelectorAll(selector));
             } else {
                 return findAll(getDocument(), eltOrSelector);
             }
@@ -593,6 +595,38 @@ return (function () {
             }
         }

+        function isXPathSelector(selector) {
+            // Non-string values can never carry the "!xpath:" prefix.
+            return typeof selector === 'string' && selector.startsWith("!xpath:");
+        }
+
+        function getXPathSelector(selector) {
+            if (isXPathSelector(selector)) return selector.slice(7); // drop "!xpath:"
+            return;
+        }
+
+        function xpathResult(eltOrSelector, xpathSelector) {
+            if (xpathSelector) {
+                var evaluator = new XPathEvaluator();
+                return evaluator.evaluate(xpathSelector, eltOrSelector, null,  XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);
+            } else {
+                return xpathResult(getDocument(), eltOrSelector);
+            }
+        }
+
+        function xpathSingle(eltOrSelector, xpathSelector) {
+            return xpathResult(eltOrSelector, xpathSelector).iterateNext();
+        }
+
+        function xpathArray(eltOrSelector, xpathSelector) {
+            var arr = [];
+            var xPathResult = xpathResult(eltOrSelector, xpathSelector);
+            for (let result = xPathResult.iterateNext(); result; result = xPathResult.iterateNext()) {
+                arr.push(result);
+            }
+            return arr;
+        }
+
         function querySelectorAllExt(elt, selector) {
             if (selector.indexOf("closest ") === 0) {
                 return [closest(elt, normalizeSelector(selector.substr(8)))];
@@ -613,7 +647,8 @@ return (function () {
             } else if (selector === 'body') {
                 return [document.body];
             } else {
-                return getDocument().querySelectorAll(normalizeSelector(selector));
+                if( isXPathSelector(selector)) return findAll(elt, normalizeSelector(selector));
+                return findAll(normalizeSelector(selector));
             }
         }

@@ -790,7 +825,7 @@ return (function () {
                 swapStyle = oobValue;
             }

-            var targets = getDocument().querySelectorAll(selector);
+            var targets = findAll(selector);
             if (targets) {
                 forEach(
                     targets,
@@ -1037,7 +1072,7 @@ return (function () {
             var selector = selectOverride || getClosestAttributeValue(elt, "hx-select");
             if (selector) {
                 var newFragment = getDocument().createDocumentFragment();
-                forEach(fragment.querySelectorAll(selector), function (node) {
+                forEach(findAll(fragment, selector), function (node) {
                     newFragment.appendChild(node);
                 });
                 fragment = newFragment;

svenberkvens commented 9 months ago

It looks like a useful addition to HTMX. The easier it is for people to use what they already know, the better. Especially with such a small change to the source code.

Delapouite commented 9 months ago

Here's an interesting thread on the WHATWG DOM repo about the current state of XPath integration in web browsers: https://github.com/whatwg/dom/issues/903

bigskysoftware / htmx

Support XPath selectors #2113