ubsicap / usfm

Unified Standard Format Markers

Clarify use of 'strong' attribute when word repetition occurs in Hebrew or Greek #72

Closed DavidHaslam closed 6 years ago

DavidHaslam commented 6 years ago

There are many instances where more than one word in the target language needs to be tagged as a single unit, because Hebrew and Greek use word repetition for (e.g.) a superlative. Here's a simple example:

\w most holy|strong="H06944, H06944"\w*

This is because the Hebrew in (e.g.) Exodus 29:37 repeats the word 'holy' to form the superlative.

In OSIS XML this becomes:

<w lemma="strong:H06944 strong:H06944">most holy</w>
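To make the mapping between the two notations concrete, here is a minimal Python sketch that turns the USFM form into the OSIS form. The function name `usfm_w_to_osis` and the regular expression are illustrative assumptions, not part of any USFM or OSIS tool, and the sketch assumes `strong` is the only attribute on the `\w` marker.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical pattern: a \w ... \w* span whose only attribute is strong="...".
W_MARKER = re.compile(
    r'\\w\s+(?P<text>[^|\\]+)\|strong="(?P<strong>[^"]*)"\s*\\w\*'
)

def usfm_w_to_osis(usfm: str) -> str:
    """Rewrite each matching \\w span as an OSIS <w lemma="..."> element."""
    def convert(match: re.Match) -> str:
        # Split the comma-separated list; repeated numbers are kept as-is,
        # so a repeated source word stays repeated in the output.
        numbers = [n.strip() for n in match.group("strong").split(",") if n.strip()]
        lemma = " ".join(f"strong:{n}" for n in numbers)
        text = escape(match.group("text").strip())
        return f"<w lemma={quoteattr(lemma)}>{text}</w>"
    return W_MARKER.sub(convert, usfm)

if __name__ == "__main__":
    print(usfm_w_to_osis(r'\w most holy|strong="H06944, H06944"\w*'))
```

The sketch deliberately keeps the duplicated number rather than de-duplicating it, since the repetition is what records that the source text repeats the word.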

In USFM, it would be incorrect to separately tag the words 'most' & 'holy'.

It may therefore be useful to include an example in the documentation where more than one word is tagged with Strong's numbers like this.

Similar situations arise where the idioms in the source and target languages do not match, so that you need to tag a whole phrase and include several Strong's numbers in the attribute value.

DavidHaslam commented 6 years ago

Aside: The only reason I had 5 digits in the Strong's numbers is that I pasted them from an existing work.