johnlumley / jwiXML

An iXML processor for JavaScript and SaxonJS
MIT License

Potential bug in XPath grammars - union operation in location step #8

Open wendellpiez opened 7 months ago

wendellpiez commented 7 months ago

The XPath grammars are awesome. I am not sure how to ensure their conformance abstractly (how to test them) - I'm thinking about that. Meantime I have found what appears to be a bug.

Using https://johnlumley.github.io/jwiXML.xhtml with either XPath grammar loaded (Full tree or Minimized tree), the expression

a//(b|c)//d

brings back a parsing error. So also a//(b| c)//d and a//(b |c)//d and a/(b|c).

Note that an equivalent XPath works as expected:

a//(b | c)//d

Since this grammar is a very useful input for me (I am having to subset XPath in iXML) I'd be grateful for info on how to repair it. (Likewise, I'll report if I can work it out for myself - since that's what the tool is for, right?)

johnlumley commented 7 months ago

Wendell (assuming it’s you - my XMLlist messages are always identified as from you ;-() - I’ll look into this over the next few days. At first glance the grammar doesn’t permit a UnionExpression as a StepExpression, which may be a hangover from XPath 2, but I will have to check. I'm not sure what would happen if you tried adding some sort of Union to StepExpression.

John


wendellpiez commented 7 months ago

John yup it's me! Glad to offer you something to think about -- interesting thing to me is that when whitespace is found in the right places, it works. Makes me wonder if it's something about the '|' character alias for union.

wendellpiez commented 7 months ago

Looking at the XPath 3.1 EBNF it looks like the path to UnionExpr is through PostfixExpr and ParenthesizedExpr ... and sure enough, this also fails using either XPath iXML grammar under jwiXML:

(a|b)

... while (a | b) comes out fine. And same without the parentheses - a|b breaks, a | b is okay.
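
The pass/fail pattern reported above (a | b accepted; a|b, b| c and b |c rejected) is what one would see if a rule required whitespace on both sides of the '|' token. The following is only a toy model in Python of that class of fault, not the jwiXML grammar itself: the two regex "rules" and the helper name `accepts` are hypothetical, and the real cause in the iXML grammars may well be something else.

```python
import re

# Simplified stand-in for an XPath name test (assumption: ASCII names only).
NAME = r"[A-Za-z_][\w.-]*"

# Hypothetical union rule that demands whitespace on BOTH sides of "|".
STRICT_UNION = re.compile(rf"{NAME}(?:\s+\|\s+{NAME})*")

# The same rule with the surrounding whitespace made optional.
LENIENT_UNION = re.compile(rf"{NAME}(?:\s*\|\s*{NAME})*")


def accepts(rule: re.Pattern, expr: str) -> bool:
    """Return True if the whole expression matches the given toy rule."""
    return rule.fullmatch(expr) is not None


# The whitespace variants discussed in this thread, reduced to bare unions.
CASES = ["a | b", "a|b", "b| c", "b |c"]

for expr in CASES:
    strict = accepts(STRICT_UNION, expr)
    lenient = accepts(LENIENT_UNION, expr)
    print(f"{expr!r:8} strict={strict} lenient={lenient}")
```

If the iXML rule for the union operator turns out to have this shape, making the whitespace around '|' optional would be the analogous repair, but that needs checking against the actual grammar source.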