Closed: r12a closed this issue 3 years ago
The first comment in this issue contains text that will automatically appear in the Tamil gap-analysis document as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.
Closing in favour of https://github.com/w3c/iip/issues/117.
Tamil is written with an alphasyllabary, and the akshara is the basic unit of the writing system. It is very common in the Tamil script to represent and break words at akshara boundaries, which users of the script recognize instinctively. The same requirement applies here.
The W3C specification points to Unicode Text Segmentation (UAX #29, formerly TR 29), and most browsers (e.g. Chrome and Firefox) appear to support it when a word is spaced by akshara. However, where an akshara is ill-formed, e.g. Consonant+Matra+Matra, the segmentation stacks the ill-formed sequence into a single unit instead of clearly breaking it into separate parts. This breaking behaviour needs to improve.
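To make the Consonant+Matra+Matra case concrete, here is a minimal Python sketch. It is not a UAX #29 implementation and not what any browser actually runs: `uax29_like_clusters` only approximates the grapheme-cluster behaviour for Tamil by attaching every combining mark to the preceding base, and `strict_clusters` is a hypothetical alternative that allows at most one vowel sign per akshara. The function names, the code point range used to detect matras, and the NFC assumption are all illustrative.

```python
import unicodedata


def is_mark(ch: str) -> bool:
    # Combining marks: nonspacing (Mn) or spacing (Mc).
    return unicodedata.category(ch) in ("Mn", "Mc")


def is_matra(ch: str) -> bool:
    # Assumption: Tamil dependent vowel signs occupy U+0BBE..U+0BCC.
    return "\u0BBE" <= ch <= "\u0BCC"


def uax29_like_clusters(text: str) -> list[str]:
    """Approximation: every combining mark joins the preceding cluster."""
    clusters: list[str] = []
    for ch in text:
        if clusters and is_mark(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def strict_clusters(text: str) -> list[str]:
    """Hypothetical stricter rule: a second matra starts a new cluster."""
    # NFC first, so two-part vowel signs such as U+0BCA are one code point.
    text = unicodedata.normalize("NFC", text)
    clusters: list[str] = []
    seen_matra = False
    for ch in text:
        if clusters and is_mark(ch) and not (is_matra(ch) and seen_matra):
            clusters[-1] += ch
        else:
            clusters.append(ch)
            seen_matra = False
        if is_matra(ch):
            seen_matra = True
    return clusters


well_formed = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"  # the word "Tamil" in Tamil script
ill_formed = "\u0B95\u0BBE\u0BBF"  # KA + vowel sign AA + vowel sign I

print(uax29_like_clusters(well_formed))
print(uax29_like_clusters(ill_formed))
print(strict_clusters(ill_formed))
```

With the well-formed word both functions return the same three aksharas. With the ill-formed sequence the first function keeps all three code points in one cluster, which matches the stacking described above, while the stricter variant splits off the second vowel sign. Whether a stray vowel sign should stand alone or be handled in some other way is a design question this sketch does not settle.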