openscriptures / morphhb

Open Scriptures Hebrew Bible
https://hb.openscriptures.org

Suggested use of the OSIS name element to enhance the OSHB project #100

Open DavidHaslam opened 11 months ago

DavidHaslam commented 11 months ago

For a semantically marked-up Hebrew Bible project such as the Open Scriptures Hebrew Bible, it would be very useful to mark every proper noun with the OSIS XML name element, and to include suitable type and subType attributes where they are helpful for semantic purposes.

Biblical Hebrew is unicameral. Without first knowing whether a word is a name, how else could one determine which words in the Hebrew Bible should begin with a capital letter when translated into a language that uses a bicameral writing system, or when transliterated into Unicode Latin script using scholarly conventions?
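To make the suggestion concrete, here is a minimal Python sketch of what wrapping a word in an OSIS name element might look like. It is only an illustration under stated assumptions: the verse fragment is a simplified, unpointed sample rather than text copied from the OSHB files, the helper wrap_names and the set NAME_LEMMAS are hypothetical, and type="person" is one plausible value rather than an agreed convention for this project.

```python
import xml.etree.ElementTree as ET

OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"
ET.register_namespace("", OSIS_NS)

# Simplified, unpointed sample fragment (an assumption, not real OSHB data).
sample = (
    f'<verse xmlns="{OSIS_NS}" osisID="Gen.12.1">'
    '<w lemma="c/559">ויאמר</w> '
    '<w lemma="3068">יהוה</w> '
    '<w lemma="413">אל</w> '
    '<w lemma="87">אברם</w>'
    '</verse>'
)

# Hypothetical lookup: lemma values that an editor has judged to be personal names.
NAME_LEMMAS = {"87"}


def wrap_names(verse, is_name, name_type="person"):
    """Wrap each <w> child for which is_name(w) is true in an OSIS <name> element."""
    for index, child in enumerate(list(verse)):
        if child.tag == f"{{{OSIS_NS}}}w" and is_name(child):
            name = ET.Element(f"{{{OSIS_NS}}}name", {"type": name_type})
            # Keep any trailing whitespace outside the new wrapper.
            name.tail = child.tail
            child.tail = None
            verse.remove(child)
            name.append(child)
            verse.insert(index, name)
    return verse


verse = ET.fromstring(sample)
wrap_names(verse, lambda w: w.get("lemma") in NAME_LEMMAS)
print(ET.tostring(verse, encoding="unicode"))
```

The decision about which words count as names is passed in as a function, since that is the part that needs linguistic judgement; the XML manipulation itself is mechanical.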

jag3773 commented 11 months ago

I think we'd be happy to entertain a pull request from you that does exactly what you suggest. Of course, there are several places where it's unclear whether a proper noun is indicated, but in the majority of cases it would be very helpful to have this information encoded.

DavidHaslam commented 11 months ago

@jag3773

My own knowledge of Hebrew is not really up to such a task - one that involves linguistic as well as technical skills.

NB. For a bicameral writing system, such as that used for the English Bible, it's feasible to identify names with almost 100% certainty using some very cunning Excel® formulae after importing a counted word list into a worksheet. In every language there are exceptions, as some names have an identical spelling to a common word. Just one example:

Almost 9 years ago, I developed such a technique for the CrossWire KJV. It's still on the roadmap for the KJV module, but has already been implemented in the DC books of the KJVA module.
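For readers without the spreadsheet to hand, the general idea of using capitalisation in a counted word list can be sketched in Python. This is not the Excel technique itself, whose formulae are not shown here; it is a simple stand-in heuristic, and the function name candidate_names and the sample sentence are made up for illustration. A word is treated as a candidate name if it occurs capitalised and never occurs in all-lowercase form.

```python
import re
from collections import Counter


def candidate_names(text, min_count=1):
    """Return {word: count} for capitalised words that never appear in lowercase."""
    counts = Counter(re.findall(r"[A-Za-z]+", text))
    lowercase_forms = {token for token in counts if token.islower()}
    return {
        token: n
        for token, n in counts.items()
        if token[0].isupper()
        and token.lower() not in lowercase_forms
        and n >= min_count
    }


# Made-up sample text, not a quotation from any Bible translation.
sample_text = (
    "And Abram went up out of Egypt, and Sarai went with him. "
    "And the king spoke to Abram."
)
print(candidate_names(sample_text))
```

A heuristic this simple has known weaknesses: it flags words such as the pronoun "I" and any word that only ever starts a sentence, and it misses names that share their spelling with a common word, so the results would still need manual review.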

cf. Earlier today, CrossWire released updated SWORD modules KJV version 3.1 and KJVA version 3.1

Also of relevance to the suggestion are the datasets developed at Tyndale House, Cambridge by Dr David Instone-Brewer and others, which are already being used in STEPBible.