Closed Rolf-Smit closed 3 years ago
Hello @Rolf-Smit ,
Thank you for posting your observation. I believe you are correct: the comma should not be in this regex, and would not occur in a sid, eid, or loc. The example you quoted about comma-delimited verses is good support, since it shows the separate components being broken apart. I cannot recall a reason for the comma being included, and unfortunately the history in this repository does not go back prior to the USX 2.6 schema.
A correction has been posted to the schema and the docs.
Jeff
@klassenjm thanks for fixing this! But I think I found some more inconsistencies/issues.
One sample shown here: https://ubsicap.github.io/usx/master/elements.html#ref is this one:
<ref loc="MAT-LUK">Mt—Lk</ref>
However, the Regex does not seem to allow for book ranges:
[A-Z1-4]{3} ?[a-z0-9\-:]*
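To make the mismatch concrete, here is a minimal sketch in Python that checks a few values against the pattern quoted above. It assumes the schema applies the pattern to the whole attribute value (as XSD-style pattern facets do), which is why `fullmatch` is used; the helper name `is_valid_loc` is hypothetical and not part of USX or any library.

```python
import re

# Pattern copied verbatim from the comment above (the version without the comma).
LOC_PATTERN = re.compile(r"[A-Z1-4]{3} ?[a-z0-9\-:]*")


def is_valid_loc(value: str) -> bool:
    """Return True only if the entire string matches the pattern.

    Assumption: the schema anchors the pattern to the whole value.
    """
    return LOC_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # The first two values are illustrative; the third is the documentation sample.
    for candidate in ["MAT 1:1", "MAT 1:1-3", "MAT-LUK"]:
        print(candidate, is_valid_loc(candidate))
```

The book range fails because the part after the three-character book ID only allows lowercase letters, digits, the dash and the colon, so the uppercase `LUK` cannot match.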
Linking attributes in USFM (described here: https://ubsicap.github.io/usfm/linking/index.html#general-syntax) also do not seem to allow for book ranges, and the Regex there still contains the comma: [A-Z1-4]{3} ?[a-z0-9\-,:]*
The same is true for the /xt marker in USFM: https://ubsicap.github.io/usfm/notes_basic/xrefs.html#xt. The link-href default attribute that can be used for the /xt tag also uses the Regex that includes a comma: [A-Z1-4]{3} ?[a-z0-9\-,:]*
TLDR: What about support for book ranges in the ref element? And what about support for book ranges in the /xt link-href attribute?
@Rolf-Smit Thank you very much for sending these notes. It is appreciated. I will review and make changes as needed, as soon as possible.
Moving USFM-specific items.
Looking at the documentation, the Regex used for chapter and verse IDs, and also for the loc attribute found in references, uses the following character class to parse the part after the Paratext book ID:
[a-z0-9\-,:]
I assume the dash - and the colon : are there to separate verses from chapters etc., because the documentation and samples clearly use those. But how is the comma , used? I don't see any example of a verse or chapter ID that uses a comma. I also can't seem to find any example of a verse ID that is non-decimal and uses a-z.
Edit: It even seems that, at least for the reference loc attribute, commas should be avoided/removed, according to the documentation: https://ubsicap.github.io/usx/elements.html#ref
I assume this is only the case for references, since multiple references can easily be added? So commas are allowed in chapter and verse IDs?
Would be really nice if some samples could be added!
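As a rough illustration of what the character class in question accepts, the Python sketch below compares the documented pattern (with the comma) against the same pattern with the comma removed. The sample values are hypothetical and chosen only to exercise each character; they are not taken from the USX documentation, and the sketch assumes the pattern must match the whole value.

```python
import re

# Full pattern as documented: book ID, optional space, then the character class.
WITH_COMMA = re.compile(r"[A-Z1-4]{3} ?[a-z0-9\-,:]*")
# Same pattern with the comma dropped from the character class.
WITHOUT_COMMA = re.compile(r"[A-Z1-4]{3} ?[a-z0-9\-:]*")

# Hypothetical sample values, for illustration only.
SAMPLES = ["GEN 1:1", "GEN 1:1-3", "GEN 1:1,3", "GEN 1:1a"]


def classify(value: str) -> tuple[bool, bool]:
    """Return (matches with comma allowed, matches without comma)."""
    return (
        WITH_COMMA.fullmatch(value) is not None,
        WITHOUT_COMMA.fullmatch(value) is not None,
    )


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, classify(sample))
```

Only the comma-separated value behaves differently between the two patterns; the value with a lowercase letter suffix is accepted by both, which is what the a-z part of the class permits.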