ubsicap / usx

Unified Scripture XML

Correct regex for @number attributes in documentation #29

Closed klassenjm closed 6 years ago

klassenjm commented 6 years ago

On Tue, 24 Oct 2017 at 11:42 Chris Hubbard chris_hubbard@sil.org wrote:

Hello,

I was referring to the USX documentation for the <verse> element and number attribute recently for an issue with Scripture App Builder. It was not correctly parsing:

<para>
  <verse number="2-6a">

I looked at the documentation and the Regular Expression for the number attribute and it isn’t a valid Regular Expression:

https://app.thedigitalbiblelibrary.org/static/docs/usx/elements.html#verse

@number: Current verse number (sequential; according to versification definition for the scripture text). *

xsd:string of pattern [0-9]+\w?(‏?[\-,][0-9]+\w?)*

I talked to Tim Steenwyk about it. It turns out the code has a non-visible RTL character that, when copy/pasted, shows up as above. He has changed the code to use the visible Unicode escape sequence instead. It would help implementors of USX to have this in the documentation as well. Here is the correct Regular Expression (Tim, please correct me if wrong!!)

[0-9]+\w?(\u200F?[\-,][0-9]+\w?)*
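As a quick sanity check, the corrected pattern can be exercised in Python. This is only a sketch: the helper name `is_valid_verse_number` and the sample values are made up for illustration, and Python's regex dialect is not identical to the one used by the USX schema (for example, Python's `\w` also accepts an underscore), so treat the results as indicative rather than authoritative.

```python
import re

# Pattern from the message above, with the right-to-left mark written
# as the visible escape \u200F instead of the invisible character.
VERSE_NUMBER = re.compile(r"[0-9]+\w?(\u200F?[\-,][0-9]+\w?)*")


def is_valid_verse_number(value: str) -> bool:
    """Hypothetical helper: True if the whole value matches the pattern."""
    # fullmatch requires the entire string to match, which is how a
    # schema pattern facet is normally applied to an attribute value.
    return VERSE_NUMBER.fullmatch(value) is not None


if __name__ == "__main__":
    # Illustrative samples only, including one with an embedded U+200F.
    for sample in ["2-6a", "1", "3a,5", "2\u200F-6a", "a2", "2-"]:
        print(repr(sample), is_valid_verse_number(sample))
```

With the escape written out, the optional right-to-left mark before each `-` or `,` separator is visible to anyone reading or copying the pattern.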

There could be several places in the documentation where this Regular Expression is used (I saw @altnumber just below @number).

Thank you for your attention to this. :-)

Grace & peace to you,

Chris

ericpyle commented 6 years ago

@klassenjm is this a request to change the USX.rng?

klassenjm commented 6 years ago

No, I think it's just a bug in the documentation. The docs contain an invisible right-to-left mark (U+200F) in the @number regex string. If you look at the string, it appears to be a bad regex, but that's only because the invisible \u200f changes the display direction of some characters.
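For anyone who wants to confirm whether a copied pattern carries a hidden character like this, here is a small Python sketch. The function name `find_format_chars` and the `copied` string are hypothetical; the check relies on the Unicode general category `Cf` (format characters), which includes U+200F.

```python
import unicodedata


def find_format_chars(text: str) -> list[tuple[int, str, str]]:
    """Hypothetical helper: list invisible format characters in text.

    Returns (index, code point, Unicode name) for every character whose
    general category is Cf, e.g. directional marks and zero-width joiners.
    """
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


if __name__ == "__main__":
    # Stand-in for a pattern pasted from the docs, with a real U+200F
    # character embedded after the opening parenthesis.
    copied = "[0-9]+\\w?(\u200F?[\\-,][0-9]+\\w?)*"
    for index, code_point, name in find_format_chars(copied):
        print(index, code_point, name)
```

An empty result means the string contains no format characters, so whatever looks wrong in it is not caused by a hidden directional mark.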