ubsicap / usfm

Unified Standard Format Markers

Glossary word markers and visibly tagging the marked word #54

Open DavidHaslam opened 7 years ago

DavidHaslam commented 7 years ago

This is about glossary word markers:

Some translators like to tag the marked words with a symbol (often an asterisk) to visibly indicate to the Bible reader that the tagged word may be found in the respective glossary.

Now that USFM 3.0 has extended syntax capabilities, it should be feasible to implement the special symbol as an attribute of the glossary word marker, rather than leaving it as part of the Biblical text, e.g.:

\w gracious|lemma="grace" strong="G05485"|marker="†"\w*

In this example, I used the dagger symbol (U+2020).

@klassenjm – please consider and review. Thanks.

DavidHaslam commented 7 years ago

This proposal would ensure that the asterisks in the USFM files are all placed within markup.

There would be no more matches for ** in words formerly marked like this: \w gracious\w**.

klassenjm commented 7 years ago

@DavidHaslam I believe the syntax would need to be (no additional pipe before the additional attribute):

\w gracious|lemma="grace" strong="G05485" marker="†"\w*

Also, I would recommend "caller" as the attribute name, since people are familiar with that term for note callers:

\w gracious|lemma="grace" strong="G05485" caller="†"\w*

What I am expecting to happen is that we will gather a list of additional attributes which users identify as broadly useful, and these can be added to a 3.1 specification. For 3.0, I'm quite sure that I will not be able to add this to what the PT team has done for Paratext 8.1 already to support USFM 3.0. (I will look into it though.) Something which does not end up in an official release could still be initially used in a text using x-, like:

\w gracious|lemma="grace" strong="G05485" x-caller="†"\w*
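
For texts that already carry a literal asterisk after the closing marker, the move into markup could be scripted. The following is a minimal Python sketch, not part of USFM or Paratext: the function name `move_caller_into_markup` and the regular expression are hypothetical, and it assumes any existing attributes use the named `key="value"` form (a default attribute such as `\w gracious|grace\w*` is not handled).

```python
import re

# Hypothetical pattern: a \w ... \w* glossary word immediately followed by a
# literal asterisk in the text. Requiring whitespace after \w keeps other
# markers such as \wj from matching.
LEGACY = re.compile(
    r"\\w\s+(?P<word>[^\\|]+?)"      # the marked word
    r"(?:\|(?P<attrs>[^\\]*))?"      # optional attribute list after the pipe
    r"\\w\*(?P<sym>\*)"              # closing marker plus the trailing asterisk
)


def move_caller_into_markup(usfm, attr="x-caller"):
    """Rewrite `\\w word\\w**` so the asterisk becomes an attribute value."""

    def repl(m):
        attrs = (m.group("attrs") or "").strip()
        caller = f'{attr}="{m.group("sym")}"'
        # Append to existing attributes with a space, not a second pipe.
        joined = f"{attrs} {caller}" if attrs else caller
        return f"\\w {m.group('word')}|{joined}\\w*"

    return LEGACY.sub(repl, usfm)


if __name__ == "__main__":
    print(move_caller_into_markup(r"\w gracious\w**"))
```

The attribute name defaults to the `x-` form so the output stays within user-defined attributes until an official name is agreed on.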

DavidHaslam commented 7 years ago

Thanks.

Something that might be accepted in principle before USFM 3.1 release would be very good.

I recognise that with new suggestions or proposals, it can take time for everyone to recognise their usefulness.

Whether the attribute name should be marker or caller is immaterial.

OSIS 2.1.1 already has an attribute called marker, but as long as conversion scripts know of the equivalence, things should be fine.

User defined attribute names with the x- prefix are well-established.

mhosken commented 3 years ago

This is typically handled by typesetting software that styles glossary items. I don't think the styling needs to be captured in the USFM itself, unless it is irregular and the irregularity needs to be captured.
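
To illustrate the render-time approach, here is a minimal Python sketch of a formatter that appends one uniform symbol to every glossary word on output, leaving the USFM source untagged. It does not describe any real typesetting tool: `render_glossary_words` and its `caller` parameter are hypothetical names, and the regular expression only covers simple, non-nested `\w ... \w*` spans.

```python
import re

# Hypothetical pattern: a \w ... \w* span, with or without attributes.
GLOSSARY_WORD = re.compile(r"\\w\s+([^\\|]+?)(?:\|[^\\]*)?\\w\*")


def render_glossary_words(usfm, caller="*"):
    """Replace each glossary span with its word followed by the caller symbol."""
    # A function is used as the replacement so that the caller symbol is
    # inserted literally, without regex escape processing.
    return GLOSSARY_WORD.sub(lambda m: m.group(1) + caller, usfm)


if __name__ == "__main__":
    print(render_glossary_words(r'The \w gracious|lemma="grace"\w* Lord'))
```

With this approach the symbol is a single stylesheet-like setting; a per-word attribute would only be needed where the tagging is irregular.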