CherokeeLanguage / CherokeeNewTestament.org

The source code for the website https://cherokeenewtestament.org/
Creative Commons Zero v1.0 Universal

Quotation marks and typographical apostrophes #9

Open DavidHaslam opened 3 years ago

DavidHaslam commented 3 years ago

The electronic text currently uses the characters \x22 and \x27 for the quotation marks and typographical apostrophes originally transcribed from the printed work.

It's apparent that the original work used proper typographical left and right double quotation marks as well as the typographical apostrophe.

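The electronic text can therefore be improved by using the following Unicode characters:

* U+201C LEFT DOUBLE QUOTATION MARK

* U+201D RIGHT DOUBLE QUOTATION MARK

* U+2019 RIGHT SINGLE QUOTATION MARK as the preferred character to use for the apostrophe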

The double quotation marks occur as 51 matched pairs.

The typographical apostrophe occurs in only 10 locations, one of which is just before a Cherokee word.

These 3 characters are supported by the Galvji font, although in that font the left and right double quotation marks are very similar in shape.

By comparison, in many other Unicode fonts the two marks have clearly different shapes.
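A minimal sketch of the proposed replacement, in Python, might look like the following. It assumes that every \x22 belongs to a strictly matched, non-nested pair (as the count of matched pairs suggests) and that every \x27 is meant as an apostrophe. The function name `curl_quotes` is hypothetical and is not part of the repository.

```python
LEFT_DQ = "\u201C"     # U+201C LEFT DOUBLE QUOTATION MARK
RIGHT_DQ = "\u201D"    # U+201D RIGHT DOUBLE QUOTATION MARK
APOSTROPHE = "\u2019"  # U+2019 RIGHT SINGLE QUOTATION MARK


def curl_quotes(text: str) -> str:
    """Replace straight quotes with typographical ones.

    Assumption: double quotes alternate open/close and are never nested.
    """
    out = []
    opening = True  # the next \x22 seen is treated as an opening mark
    for ch in text:
        if ch == '"':
            out.append(LEFT_DQ if opening else RIGHT_DQ)
            opening = not opening
        elif ch == "'":
            out.append(APOSTROPHE)
        else:
            out.append(ch)
    if not opening:
        # An odd number of \x22 means the pairing assumption does not hold.
        raise ValueError("unmatched straight double quotation mark")
    return "".join(out)
```

Raising on an unmatched mark is a deliberate choice here: it is safer for a passage with an odd number of quotation marks to be reviewed by hand than to be converted silently.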

DavidHaslam commented 3 years ago

Strictly speaking, all 322 apostrophes used for possessives in the English literal translation should also use U+2019 rather than U+0027.
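To check counts like these before and after any change, a small tally of the relevant code points could help. The following is only a sketch; the helper name `count_marks` is hypothetical, and how the text is loaded is left to the caller.

```python
from collections import Counter

# The straight and typographical marks discussed in this issue.
MARKS = {
    "\u0022": "U+0022 QUOTATION MARK",
    "\u0027": "U+0027 APOSTROPHE",
    "\u201C": "U+201C LEFT DOUBLE QUOTATION MARK",
    "\u201D": "U+201D RIGHT DOUBLE QUOTATION MARK",
    "\u2019": "U+2019 RIGHT SINGLE QUOTATION MARK",
}


def count_marks(text: str) -> dict:
    """Return how often each mark of interest occurs in text."""
    counts = Counter(ch for ch in text if ch in MARKS)
    # Report every mark, including those that occur zero times.
    return {name: counts[ch] for ch, name in MARKS.items()}
```

After a full conversion, the two straight-mark counts should drop to zero, and the U+201C and U+201D counts should be equal.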

michael-conrad commented 3 years ago

> The electronic text currently uses characters \x22 and \x27 for the quotation marks and typographical apostrophes originally transcribed from the printed work.
>
> It's apparent that the original work used proper typographical left and right double quotation marks as well as the typographical apostrophe.
>
> The electronic text can be therefore improved by the use of the following Unicode characters:
>
> * U+201C LEFT DOUBLE QUOTATION MARK
>
> * U+201D RIGHT DOUBLE QUOTATION MARK
>
> * U+2019 RIGHT SINGLE QUOTATION MARK as the preferred character to use for the apostrophe
>
> The former occur as 51 matched pairs.
>
> The latter occurs in only 10 locations, one of which is just before a Cherokee word.
>
> These 3 characters are supported by the Galvji font, albeit the left and right double quotation marks are very similar in shape.
>
> cf. In many other Unicode fonts, they have a different shape from each other.

I'm currently working out a way to fix up the text using the case-marked version, which we treat as the master copy. I'm not sure how this will affect these issues.

A copy of the case marked version has been added in the folder "reference".