Open martindholmes opened 1 year ago
In its recent statement (linked below), the Japanese government recommends "、" (U+3001) for the comma and "。" (U+3002) for the period, even in horizontal text. So, unless we need to stay consistent with earlier usage, and provided the existing marks are easy to convert, it would be better to switch to these. https://www.bunka.go.jp/seisaku/bunkashingikai/kokugo/hokoku/pdf/93651301_01.pdf
Thanks @knagasaki! I can make the changes, and we could even add a bit of Schematron to the Guidelines schema when all the conversions are done.
Mixed in with the regular ASCII punctuation marks, we also have these:
- U+FF0C FULLWIDTH COMMA
- U+FF0E FULLWIDTH FULL STOP
I'll normalize these to U+3001 and U+3002 too.
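For anyone who wants to reproduce this normalization, here is a minimal sketch in Python. It is not the script actually used for the Guidelines; the function name `normalize_fullwidth` and the mapping table are hypothetical, and it only covers the two fullwidth characters listed above.

```python
# Hypothetical mapping: fullwidth Western-style marks -> Japanese marks.
FULLWIDTH_TO_JAPANESE = str.maketrans({
    "\uFF0C": "\u3001",  # FULLWIDTH COMMA -> IDEOGRAPHIC COMMA
    "\uFF0E": "\u3002",  # FULLWIDTH FULL STOP -> IDEOGRAPHIC FULL STOP
})


def normalize_fullwidth(text: str) -> str:
    """Replace U+FF0C and U+FF0E with U+3001 and U+3002.

    ASCII commas and periods are deliberately left alone, because they
    can be legitimate (for example inside numerals).
    """
    return text.translate(FULLWIDTH_TO_JAPANESE)
```

Unlike the ASCII marks, the fullwidth characters are unlikely to appear in numerals or code, which is why a blanket replacement seems reasonable for them; the results would still be worth a quick look before committing.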
As of today, I've done all the normalizations I think are required. I'm going to keep this ticket open for the moment, because the Japanese translations are being updated steadily, and new instances are being introduced. This is something that can't easily be done with Schematron because there are contexts within the Japanese text where ASCII commas and periods are actually fine (such as within numerals, or when discussing punctuation marks), so a quick pass before every new release is probably a good idea. I'll check again when the 4.7.0 freeze is in place.
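The pre-release pass could be partly mechanized as a report for human review rather than an automatic fix. The sketch below is only an illustration under stated assumptions: `find_suspects` is a hypothetical helper, and "Japanese character" is approximated as Hiragana, Katakana, and the basic CJK Unified Ideographs block. It flags an ASCII comma or period only when it directly follows such a character, so punctuation inside numerals is skipped.

```python
import re

# Assumption: Hiragana + Katakana (U+3040-U+30FF) and the basic
# CJK Unified Ideographs block (U+4E00-U+9FFF) stand in for "Japanese text".
JAPANESE_CHAR = r"[\u3040-\u30FF\u4E00-\u9FFF]"

# An ASCII comma or period immediately preceded by a Japanese character.
SUSPECT = re.compile(rf"(?<={JAPANESE_CHAR})[,.]")


def find_suspects(text: str) -> list[tuple[int, str]]:
    """Return (offset, character) pairs for ASCII marks that need review."""
    return [(m.start(), m.group()) for m in SUSPECT.finditer(text)]


if __name__ == "__main__":
    for offset, mark in find_suspects("これは,テストです."):
        print(f"offset {offset}: {mark!r}")
```

Passages that discuss punctuation marks themselves could still be flagged, so each hit would need a human decision; that is the same limitation that makes a Schematron rule awkward here.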
Found a couple more and fixed them in commit cd39bc5d4. I'm now going to set this to the next milestone.
There's a discussion on https://github.com/TEIC/TEI/pull/2294 about whether Japanese punctuation marks ("、" U+3001 for the comma, "。" U+3002 for the period) or the Western period and comma should be used in the Japanese text of the Guidelines. At the moment it's a mixture. Whichever we settle on, we just need to standardize throughout the Japanese text of the Guidelines. This is an easy fix, which I'm happy to work away at once the decision is made.