SASDigitalHumanitiesTraining / TextEncoding

Text Encoding for Ancient and Modern Literature, Languages and History

Operator question, "|" vs. "or" #28

Open bitparity opened 2 years ago

bitparity commented 2 years ago

How come //div/div/(preceding-sibling::* | following-sibling::*) uses | but not or (i.e. it returns an error if you try to use or) whereas //sp[(@who = "Hamlet") or (@who = "Ophelia")] is the opposite, i.e. uses or but not |?

cmohge1 commented 2 years ago

Long answer and short answer. Short answer is that your example

//div/div/(preceding-sibling::* | following-sibling::*)

has slightly different syntax. It does not have a predicate within [ ]. But

//div/div[preceding-sibling::div | following-sibling::div]

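should work (it has a predicate in square brackets). Inside a predicate, a node set counts as true when it is non-empty, so this keeps every div that has at least one sibling div.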

Longer answer. Strictly speaking, | is the union operator: it takes two node sets and returns a single node set containing every node from both (so it is a true combine function).

If you use or in an XPath expression it thinks you're doing a Boolean test. So if I try

//lg or //ab

XPath returns 'true' because at least one of the two node sets is non-empty (here the file has both lgs and abs, but one of them would be enough).
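
To see the difference outside the editor, here is a minimal sketch using the XPath 1.0 engine that ships with the JDK (javax.xml.xpath). The sample document, the class name UnionVsOr and the helper names are made up for illustration; they are not the course file or anything from the original expressions. It asks for a node set from the | expression and a Boolean from the or expression:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UnionVsOr {
    // Hypothetical sample document: two <lg> elements and one <ab>.
    static final String XML = "<text><lg/><lg/><ab/></text>";

    // Evaluate the expression as a node set and return how many nodes it selects.
    static int countNodes(String expr) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluate the expression as a Boolean test.
    static boolean isTrue(String expr) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countNodes("//lg | //ab"));     // union: 3 nodes
        System.out.println(isTrue("//lg or //ab"));        // true
        System.out.println(isTrue("//lg or //nosuch"));    // still true: one side is non-empty
        System.out.println(isTrue("//nosuch or //nothing")); // false: both sides are empty
    }
}
```

So | hands back the nodes themselves, while or only ever hands back true or false.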

With

//sp[(@who = "Hamlet") or (@who = "Ophelia")]

you have an expression that is equivalent to

//sp[@who = "Hamlet" or @who = "Ophelia"]

The 'or' within a predicate allows you to combine predicate conditions, which makes sense because 'or' means that at least one of the conditions has to be satisfied. A predicate is a true/false test applied to each node, whereas a path step has to produce nodes, so the two follow slightly different rules. I hope that makes sense. Gabby might have a better explanation.
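
As a rough check of the predicate case, the sketch below (again using the JDK's javax.xml.xpath, with a made-up sample document and hypothetical class and helper names) counts the speeches selected by the or predicate, and by the equivalent union of two separate paths. The string literals use single quotes, which XPath also accepts, only to avoid escaping inside Java strings:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PredicateOr {
    // Hypothetical sample data: four speeches, three of them by Hamlet or Ophelia.
    static final String PLAY =
        "<play><sp who='Hamlet'/><sp who='Ophelia'/><sp who='Polonius'/><sp who='Hamlet'/></play>";

    // Evaluate the expression as a node set and return how many nodes it selects.
    static int count(String expr) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(PLAY)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 'or' combines two Boolean conditions inside one predicate.
        System.out.println(count("//sp[@who = 'Hamlet' or @who = 'Ophelia']"));      // 3
        // '|' combines two node sets, each selected by its own path.
        System.out.println(count("//sp[@who = 'Hamlet'] | //sp[@who = 'Ophelia']")); // 3
    }
}
```

Both expressions select the same three sp elements here; the first filters one node set with a combined test, the second merges two node sets.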

This exchange on Stack Overflow might help, too. https://stackoverflow.com/questions/23990923/xpath-multiple-and-or-operators

gabrielbodard commented 2 years ago

I don't have a clearer answer, in fact I hadn't thought through my understanding of | vs or fully until now, but the explanation that or is a boolean operator (i.e. returns true or false and therefore can only be used in predicates or tests, not to select nodes) makes the most sense to me.