DARIAH-ERIC / lexicalresources

Data space of the DARIAH Lexical Resources Working Group
https://dariah-eric.github.io/lexicalresources/
BSD 2-Clause "Simplified" License

Suggestion to section 9.1. Inheritance of xml:lang #131

Open daliboris opened 3 years ago

daliboris commented 3 years ago

You can also use XPath's lang() function (see here).

XPath expression

//orth[ancestor-or-self::*[@xml:lang][1][lang('en')]]

returns orthographic forms for all varieties of English, if they are defined, e.g. en, en-GB, en-US etc.
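
To make the matching behaviour concrete, here is a minimal Python sketch that emulates what lang() does. It uses only the standard library's xml.etree.ElementTree, so the inheritance lookup is done by hand rather than through an XPath engine. The sample entry and the helper names effective_lang and lang_matches are hypothetical and not taken from TEI Lex-0.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data for illustration only.
SAMPLE = """
<entry xml:lang="en">
  <form><orth>colour</orth></form>
  <form xml:lang="en-US"><orth>color</orth></form>
  <form xml:lang="fr"><orth>couleur</orth></form>
</entry>
"""


def effective_lang(elem, parents):
    """Return xml:lang of elem or of its nearest ancestor, else None."""
    while elem is not None:
        value = elem.get(XML_LANG)
        if value is not None:
            return value
        elem = parents.get(elem)  # the root has no parent, so the loop ends
    return None


def lang_matches(elem, parents, code):
    """Emulate lang(code): case-insensitive, equal to code or code plus a '-' suffix."""
    value = effective_lang(elem, parents)
    if value is None:
        return False
    value, code = value.lower(), code.lower()
    return value == code or value.startswith(code + "-")


root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map once.
parents = {child: parent for parent in root.iter() for child in parent}

english = [o.text for o in root.iter("orth") if lang_matches(o, parents, "en")]
american = [o.text for o in root.iter("orth") if lang_matches(o, parents, "en-US")]
print(english)   # ['colour', 'color']
print(american)  # ['color']
```

The first orth has no xml:lang anywhere on its own element or its form, so it inherits "en" from the entry. The second matches "en" because "en-US" begins with "en-".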

ttasovac commented 3 years ago

Nice! I like it very much. Will include it in the documentation.

daliboris commented 1 year ago

As it is written here:

A node's language is determined by its xml:lang attribute. If the current node does not have an xml:lang attribute, then the value of the xml:lang attribute of the nearest ancestor that has an xml:lang attribute will determine the current node's language. If the language cannot be determined (no ancestor has an xml:lang attribute), this function will return false.

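With the lang() function there is no need to search the ancestors. That is, //orth[lang('en')] is equivalent to //orth[ancestor-or-self::*[@xml:lang][1][lang('en')]]. The expression //orth[ancestor-or-self::*[@xml:lang][1][@xml:lang='en']] gives the same result only when the inherited value is exactly 'en', because lang('en') also matches sublanguages such as en-GB and ignores case.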

I recommend using only the simplified XPath with the lang() function in section 10.1, Inheritance of xml:lang: //orth[lang('en')]. It should come with the explanation that, to restrict the search to a particular 'sublanguage', one should use the whole value, for example //orth[lang('en-GB')].

ttasovac commented 1 year ago

This is very nice, Boris. Could you create a pull request for this against the dev branch? I would like, for pedagogical reasons, to keep both: first suggest the lang() function, and then also explain what it's equivalent to, so that people understand that lang() is really just a shorthand for xml:lang inheritance.

What was originally section 9.1 is now section 10.1: https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#index.xml-body.1_div.10_div.1

laurentromary commented 1 year ago

We need to at least keep the reference to XPath for the case where we want to retrieve the current value of @xml:lang, since lang() only tests for a given value.
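
In XPath 1.0 the inherited value itself can be read with an expression such as ancestor-or-self::*[@xml:lang][1]/@xml:lang, evaluated from the node of interest. As a rough illustration of the difference between testing and retrieving, the following Python sketch performs the same lookup by hand with the standard library's xml.etree.ElementTree. The sample entry and the helper name nearest_lang are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data for illustration only.
SAMPLE = """
<entry xml:lang="de">
  <form><orth>Farbe</orth></form>
  <cit xml:lang="en-GB"><quote>colour</quote></cit>
</entry>
"""


def nearest_lang(elem, parents):
    """Return the xml:lang value in effect for elem, or None if none is set."""
    while elem is not None:
        value = elem.get(XML_LANG)
        if value is not None:
            return value
        elem = parents.get(elem)  # walk up one level; the root has no parent
    return None


root = ET.fromstring(SAMPLE)
# Build a child -> parent map, since ElementTree elements do not know their parent.
parents = {child: parent for parent in root.iter() for child in parent}

langs = {
    elem.text: nearest_lang(elem, parents)
    for elem in root.iter()
    if elem.tag in ("orth", "quote")
}
print(langs)  # {'Farbe': 'de', 'colour': 'en-GB'}

# An element with no xml:lang on itself or any ancestor has no determinable language.
orphan = ET.fromstring("<orth>farba</orth>")
print(nearest_lang(orphan, {}))  # None
```

Unlike lang(), which answers only yes or no for a value supplied by the caller, this lookup returns the actual tag in effect, for example "en-GB" rather than merely "matches en".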