w3c / alreq

Documenting gaps and requirements for support of Arabic and Persian on the Web and in eBooks.

Characters that are not used for Arabic/Persian #277

Open xfq opened 1 month ago

xfq commented 1 month ago

https://www.w3.org/TR/alreq/#h_character_tables_punctuation_and_symbols

There are some characters that are not used for Arabic (like U+0020 SPACE and U+002A ASTERISK), and some characters that are not used for Persian (like U+0022 QUOTATION MARK). What are the criteria for selecting these characters?
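
For anyone checking the code points cited above, here is a minimal Python sketch that resolves each one to its Unicode character name with the standard `unicodedata` module. The `CITED` list and the `describe` helper are hypothetical names used only for illustration. The sketch only confirms character identities; it says nothing about whether a character is used for Arabic or Persian.

```python
import unicodedata

# Code points mentioned in this issue (hypothetical list name, for illustration).
CITED = [0x0020, 0x002A, 0x0022]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using Python's bundled Unicode data."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


for cp in CITED:
    print(describe(cp))
```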

shervinafshar commented 1 month ago

For Persian, a standard is available: ISIRI-9147 (pp. 17-19 of the PDF). For Arabic, we couldn't find such a document and, if I recall correctly, we relied on CLDR data. The case of U+0020 for Arabic seems to be an error. We probably need to revisit this section for Arabic.

Also, in case you were unaware, we provisionally recorded our non-normative references in a spreadsheet here, with the objective of eventually migrating the content to the document. I added #278.