TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

Need to standardize punctuation marks in Japanese translations #2337

Open martindholmes opened 1 year ago

martindholmes commented 1 year ago

There's a discussion on https://github.com/TEIC/TEI/pull/2294 about whether Japanese punctuation marks (、 U+3001 for comma, 。 U+3002 for period) or the Western period and comma should be used in the Japanese text of the Guidelines. At the moment it's a mixture. Whichever we settle on, we just need to standardize it throughout the Japanese text of the Guidelines. This is an easy fix, which I'm happy to work away at once the decision is made.

knagasaki commented 1 year ago

In the recent statement from the Japanese government linked below, "、 U+3001 for comma" and "。 U+3002 for period" are recommended even in horizontal text. So, if we don't need to stay consistent with the earlier text, and the existing marks are easy to convert, it would be better to make the change. https://www.bunka.go.jp/seisaku/bunkashingikai/kokugo/hokoku/pdf/93651301_01.pdf

martindholmes commented 1 year ago

Thanks @knagasaki! I can make the changes, and we could even add a bit of Schematron to the Guidelines schema when all the conversions are done.

martindholmes commented 1 year ago

Mixed in with the regular ASCII punctuation marks, we also have these:

U+FF0C FULLWIDTH COMMA
U+FF0E FULLWIDTH FULL STOP

I'll normalize these to U+3001 and U+3002 too.
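
For illustration, that fullwidth normalization could be sketched in Python roughly as follows. This is a minimal sketch, not the script actually used on the Guidelines source; the name `normalize_fullwidth` is hypothetical, and it assumes every U+FF0C and U+FF0E in the input really is sentence punctuation.

```python
# Hypothetical helper; not the script actually used for the Guidelines.
# Assumption: every fullwidth comma / full stop in the input should be converted.
FULLWIDTH_TO_JA = str.maketrans({
    "\uFF0C": "\u3001",  # FULLWIDTH COMMA -> IDEOGRAPHIC COMMA
    "\uFF0E": "\u3002",  # FULLWIDTH FULL STOP -> IDEOGRAPHIC FULL STOP
})


def normalize_fullwidth(text: str) -> str:
    """Replace U+FF0C and U+FF0E with U+3001 and U+3002; leave all else alone."""
    return text.translate(FULLWIDTH_TO_JA)
```

ASCII commas and periods are deliberately left untouched here, since deciding which of those to convert depends on context.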

martindholmes commented 8 months ago

As of today, I've done all the normalizations I think are required. I'm going to keep this ticket open for the moment, because the Japanese translations are being updated steadily, and new instances are being introduced. This is something that can't easily be done with Schematron because there are contexts within the Japanese text where ASCII commas and periods are actually fine (such as within numerals, or when discussing punctuation marks), so a quick pass before every new release is probably a good idea. I'll check again when the 4.7.0 freeze is in place.
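
A quick pre-release pass of this kind could be approximated with a heuristic like the Python sketch below. It is only an assumption about what such a check might look like: it flags an ASCII comma or period when the character immediately before it is hiragana, katakana, or a CJK ideograph, so version numbers such as 4.7.0 are skipped. The function name and the character ranges are hypothetical choices, and passages that discuss punctuation marks would still need a human eye.

```python
import re

# Assumption: "Japanese character" means hiragana, katakana, or a CJK unified
# ideograph. This is a heuristic for review, not a rule from the Guidelines.
JA_CHAR = r"[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]"

# An ASCII comma or period directly preceded by a Japanese character.
SUSPECT = re.compile(rf"(?<={JA_CHAR})[,.]")


def find_suspect_punctuation(text):
    """Return (offset, mark) pairs for ASCII commas/periods right after Japanese text."""
    return [(m.start(), m.group()) for m in SUSPECT.finditer(text)]
```

The output is a list of candidates to review by hand rather than a list of errors, which matches the point above that the check cannot be fully automated.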

martindholmes commented 7 months ago

Found a couple more and fixed them in commit cd39bc5d4. I'm now going to set this to the next milestone.