w3c / iip

Documenting gaps and requirements for support of Indic languages on the Web and in eBooks.
https://w3c.github.io/iip/

Space toggle switch may be useful for Gurmukhi #98

Open r12a opened 4 years ago

r12a commented 4 years ago

Certain older Gurmukhi texts do not contain space characters. This form is known as Larivaar, and some websites allow you to toggle it as a setting. It might make sense for this to be a transform property on the text, so that the word boundaries are preserved for line-breaking.
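
To make the idea concrete, here is a minimal sketch of such a transform, under one assumption this issue does not settle: that the toggle replaces each visible space with ZWSP (U+200B), so the words run together on screen while line-break opportunities remain at word boundaries. The function names `to_larivaar` and `from_larivaar` are hypothetical and are not taken from any existing library or specification.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but still a line-break opportunity


def to_larivaar(text: str, keep_break_opportunities: bool = True) -> str:
    """Hide inter-word spaces to produce a Larivaar-style display string.

    Assumption: only U+0020 SPACE is treated as a word separator.
    With keep_break_opportunities=True each space becomes ZWSP, so a
    renderer can still wrap lines at the original word boundaries.
    With False the spaces are simply dropped and the boundaries are lost.
    """
    replacement = ZWSP if keep_break_opportunities else ""
    return text.replace(" ", replacement)


def from_larivaar(text: str) -> str:
    """Restore visible spaces; only possible where ZWSP was kept."""
    return text.replace(ZWSP, " ")


sample = "ਸਤਿ ਨਾਮੁ"
joined = to_larivaar(sample)            # no visible space, boundary kept as ZWSP
restored = from_larivaar(joined)        # round-trips back to the spaced form
```

The round trip only works in the ZWSP variant, which is the practical argument for treating the toggle as a display-time transform and not as a change to the stored text.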

r12a commented 4 years ago

The first comment in this issue contains text that will automatically appear in the Gurmukhi gap-analysis document as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

NorbertLindenberg commented 4 years ago

Is the expected behavior then that line breaks still occur at word boundaries, or can they occur at any aksara boundary?

vivekpani commented 4 years ago

Order of priority: word boundary, sandhi vichchhed, akshara boundary.

Sandhi is not considered anywhere in the standards for Indic languages. Moreover, some oversimplification efforts, limited to the most popular script (Devanagari), have caused many akshara conjuncts to be displayed in broken form, with a visible halant. That effectively introduces multiple akshara boundaries in the script, even though the conjunct is still an unbreakable unit within a word in the language. This distinction is lost because exactly the same display can be produced by using ZWNJ.

In short, these can never be done correctly, no matter what is defined within the existing standards.

r12a commented 4 years ago

@kulpreetchilana these were your words. Are you able to offer further details?

fantasai commented 4 years ago

Unclear to me what the space toggle switches from/to. ZWSP? Nothing? Something else?