adyeths / u2o

USFM to OSIS bible format converter.
The Unlicense

Adding an osisRef to a footnote catchWord element? #33

Open DavidHaslam opened 6 years ago

DavidHaslam commented 6 years ago

Here's an example of a note in the source text of the KJV module:

<note type="study"><catchWord osisRef="1Chr.21.22@s[Grant]">Grant</catchWord>: <abbr expansion="Hebrew">Heb.</abbr> <rdg type="x-literal">Give</rdg>.</note>

NB. This is one of the very few source texts that CrossWire maintains.

Observe the optional use of an osisRef attribute for the catchWord element. See the extended syntax with @s[Grant] appended to the verse reference.

Now suppose we have a well-behaved USFM file containing a correctly positioned footnote that includes \fk _keyword_ : \ft ..., where _keyword_ is some text that occurs immediately before the footnote.

In theory, it should be feasible to implement this enhancement for the catchWord osisRef attribute.

And as you enjoy fun challenges, this fits nicely with your aptitudes.

Please note the following:

In USFM, the convention is to include the colon punctuation mark after the keyword. That's merely a convention. It's not a syntactic requirement. The colon may be optionally preceded by a space. NB. Some writing systems (with non-Roman scripts) may use a different character than an ordinary colon.

However, in the OSIS example from the KJV, you will see that the colon has been moved to after the end of the catchWord element. This made sense when I automatically scripted the addition of these osisRef attributes, because it meant that the @s[_keyword_] extension (if ever implemented in SWORD) would be able to find it in the text, seeing as the colon is not part of the text.
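To make the idea concrete, here is a minimal Python sketch of the transformation described above. It is only an illustration, not u2o's actual code: the function name catchword_from_fk, its arguments, and the set of recognised colon characters are all assumptions. It strips a trailing colon (optionally preceded by a space) from the \fk text, moves it after the catchWord element, and adds the osisRef only when the keyword is a single word, since the s operator does not allow spaces.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Trailing separator after the keyword: optional whitespace, then a colon.
# The fullwidth colon (U+FF1A) is listed only as one example of a
# non-Roman variant; a real converter would need a longer list.
_TRAILING = re.compile(r"\s*[:\uFF1A]\s*$")


def catchword_from_fk(fk_text, osis_id):
    """Build a catchWord element from the text of a USFM \\fk marker.

    fk_text is the raw keyword text (hypothetical input, e.g. "Grant :"),
    osis_id is the verse the footnote belongs to (e.g. "1Chr.21.22").
    """
    text = fk_text.strip()
    match = _TRAILING.search(text)
    if match:
        keyword = text[:match.start()]
        separator = match.group().strip()
    else:
        keyword, separator = text, ""

    # The s grain operator cannot hold a phrase, so only single-word
    # keywords get an osisRef attribute.
    if keyword and not any(ch.isspace() for ch in keyword):
        ref = quoteattr(f"{osis_id}@s[{keyword}]")
        element = f"<catchWord osisRef={ref}>{escape(keyword)}</catchWord>"
    else:
        element = f"<catchWord>{escape(keyword)}</catchWord>"

    # The colon ends up outside the element, as in the KJV example.
    return element + separator


print(catchword_from_fk("Grant :", "1Chr.21.22"))
```

For multi-word keywords the sketch falls back to a plain catchWord element with no osisRef, which keeps the output valid while leaving the decision about phrases open.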

adyeths commented 6 years ago

Is this a part of OSIS 2.1.1, or is it a CrossWire-specific extension?

DavidHaslam commented 6 years ago

It's not a requirement of OSIS but it is defined by OSIS.

See page 148 of the OSIS 2.1 User Manual:

The s grain operator is a string, enclosed by square brackets and preceded by the "@" sign, all of which follows, at a minimum, the main part of an osisRef. For example:

  • RSV:Gen.1.1@s[beginning] Points at the starting character of the word "beginning."
  • RSV:Gen.3.20@s[Eve] Points to the starting character of the word "Eve."

You may wish to convince yourself that the s operator is easier to use than cp but to each his own.

Warning: Note that the s operator does not allow spaces. That is to say that you cannot put a phrase between the square brackets. That limitation is due to the handling of spaces in XML. It was an issue that the editors struggled with for some time, but ultimately it was decided that word-level matching would meet most users' needs.
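As a rough illustration of the syntax quoted above, the following Python sketch splits an osisRef into its optional work prefix, main reference, and optional s grain, and rejects a grain that contains spaces. The names split_osis_ref and locate_grain are hypothetical, the regular expression is a simplification that ignores other grain operators such as cp, and the way the grain is resolved against the verse text is an assumption, since the manual only says it points at the starting character of the word.

```python
import re

# Simplified pattern: optional "work:" prefix, main reference, optional @s[word].
# This does not attempt to cover the full osisRef grammar.
_OSIS_REF = re.compile(
    r"^(?:(?P<work>[^:@\s]+):)?"
    r"(?P<ref>[^:@\s]+)"
    r"(?:@s\[(?P<grain>[^\]\s]+)\])?$"
)


def split_osis_ref(value):
    """Return (work, ref, grain); work and grain are None when absent."""
    match = _OSIS_REF.match(value)
    if match is None:
        raise ValueError(f"unsupported osisRef: {value!r}")
    return match.group("work"), match.group("ref"), match.group("grain")


def locate_grain(verse_text, grain):
    """Return the index of the first whole-word occurrence of grain, or -1.

    Whole-word matching is an assumption based on the manual's remark
    about word-level matching.
    """
    pattern = r"(?<!\w)" + re.escape(grain) + r"(?!\w)"
    found = re.search(pattern, verse_text)
    return found.start() if found else -1


print(split_osis_ref("RSV:Gen.1.1@s[beginning]"))
print(split_osis_ref("1Chr.21.22@s[Grant]"))
```

A reference such as RSV:Gen.1.1@s[the beginning] raises ValueError here, which mirrors the restriction that a phrase cannot appear between the square brackets.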