shabados / gurmukhi-utils

Utilities library for converting, analyzing, and testing Gurmukhi strings.
MIT License

toAscii for non gurbani text #196

Closed sanbroz closed 2 years ago

sanbroz commented 2 years ago

Hi, can we use toAscii and the other translation functions with general Punjabi text other than Gurbani text?

Harjot1Singh commented 2 years ago

Hi there!

Yes - it should work.

Out of interest, what is your use case?

bhajneet commented 2 years ago

Hi,

There is a website https://unicode.sarabveer.me/ that I believe uses gurmukhi-utils on the backend. It should give you an idea of what is possible.

If you run into any issues, please let us know or open a PR to this repo to fix it. I'm also interested in how you're planning to use gurmukhi-utils, so please let us know when you get a chance!

sanbroz commented 2 years ago

Thanks, I just wanted to build a CLI tool to convert text files to and from Unicode text, but I found some errors in the conversion.

Gurbani-Akhar sample text:

ibMRdwbn pRisD ieiqhwisk ngrI hY[ ikhw jWdw hY ik jmnw ndI dy kol ies sQwn ’qy iek jMgl sI ijQy kydwr dI puqRI ivMRdw qp krdI sI[ ies dy nW ’qy ies bn dw nW ibRMdwbn ipAw[ rwDw dw iek nW vI ivRMdw dsdy hn[ ivRMdw qulsI nMU vI ikhw jWdw hY[ jmnw dy iknwry qulsI dw jMgl sI ijs krky ies sQwn dw nW ibRMdwbn pY igAw[ sRI gurU gRMQ swihb ivc vI ies ngrI nMU ikRSn jI nwl joV ky dyiKAw jWdw hY[ gurU nwnk dyv jI &urmwauNdy hn:

It's translating the first word to ਬਿ੍ਰੰਦਾਬਨ

Thanks for your help


bhajneet commented 2 years ago

Imo, this example ibMRdwbn is incorrectly written in ASCII. It should be ibRMdwbn which does correctly translate to ਬ੍ਰਿੰਦਾਬਨ. Perhaps a pre-processing step that finds MR and swaps them to RM would be the solution?
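A minimal sketch of the pre-processing step suggested above, assuming it runs on the ASCII text before it is handed to the converter. The function name `swap_mr_to_rm` is hypothetical and is not part of gurmukhi-utils. A blanket swap like this may be too aggressive for other inputs, so treat it as a starting point.

```python
def swap_mr_to_rm(ascii_text: str) -> str:
    """Reorder every 'MR' pair to 'RM' in ASCII-encoded Gurmukhi text.

    Hypothetical helper: it only rewrites the input string and does not
    call any gurmukhi-utils function.
    """
    return ascii_text.replace("MR", "RM")


# The first word of the sample text, before and after the swap.
print(swap_mr_to_rm("ibMRdwbn"))  # ibRMdwbn
```

Input that already uses the RM order passes through unchanged, so the step can be applied to a whole file without checking each word first.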

sarabveer commented 2 years ago

ibRMdwbn is correct. Pronunciation goes: ਬ੍ਰ + ਿ + ੰ + ਦਾ + ਬਨ

ਬ੍ਰਿੰਦਾਬਨ = ਬ U+0A2C + ੍ U+0A4D + ਰ U+0A30 + ਿ U+0A3F + ੰ U+0A70 + ਦ U+0A26 + ਾ U+0A3E + ਬ U+0A2C + ਨ U+0A28
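When comparing converter output against a breakdown like the one above, it can help to list the code points programmatically instead of reading the rendered glyphs. This is a small sketch using only the Python standard language features; `codepoints` is a hypothetical helper name, and the expected string is built from the escapes listed in the breakdown.

```python
def codepoints(text: str) -> list[str]:
    """Return each character of `text` as a 'U+XXXX' label, in storage order."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The expected word, written with escapes so the order is unambiguous.
EXPECTED = "\u0A2C\u0A4D\u0A30\u0A3F\u0A70\u0A26\u0A3E\u0A2C\u0A28"

print(" + ".join(codepoints(EXPECTED)))
```

Running the same helper on a converter's actual output and comparing the two lists shows exactly where the ordering differs.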