virtualvinodh / aksharamukha

Aksharamukha
Support Bengali Transcription #65

Open virtualvinodh opened 4 years ago

chakrabortydeepro commented 3 years ago

@virtualvinodh

V and B in Bengali

We certainly do not have v in Bengali. I catalogued more than 1000 Sanskrit manuscripts in the Bengali script and I found both Sanskrit sounds v and b (sometimes r as well) are represented with the single character ব্ . In recent Bengali prints, however (I can share some samples if you require), we find a glyph ৰ্ (which is r in Assamese and in some Bengali manuscripts) that is used to represent b in contrast with v. But I don't think this character is recognized as a b in Unicode. The Unicode ৰ্ is an r, so when it is used in a consonant cluster it takes the undesired shape of a pre- or postconsonantal r and does not serve our purpose.
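For reference, the identity of these letters can be checked against the Unicode Character Database. This is a minimal sketch using Python's standard `unicodedata` module; the dictionary keys are just labels chosen here for illustration.

```python
import unicodedata

# The three letters discussed above, written as escapes so the code points are explicit.
letters = {
    "ba": "\u09AC",         # ব
    "ra_middle": "\u09F0",  # ৰ (the glyph some prints use for b)
    "ra_lower": "\u09F1",   # ৱ (Assamese wa)
}

# Look up the official character name of each letter.
names = {label: unicodedata.name(ch) for label, ch in letters.items()}

for label, ch in letters.items():
    print(f"U+{ord(ch):04X} {ch} {names[label]}")
```

The names confirm the point made above: as far as Unicode is concerned, ৰ is a form of RA, not of BA.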

tv and tb

Your modification of tv and tb now looks perfect and gives the desired result. Thanks a lot.

virtualvinodh commented 2 years ago

@chakrabortydeepro @milindchakraborty

I just implemented (See: https://github.com/virtualvinodh/aksharamukha/issues/173) using /ৰ/ as b in Bengali. It should be live soon.

I had to jump through lots of hoops to implement it and it felt sort of very hacky.

I am thinking of writing a Unicode proposal. Could you both please provide me with some more images of publications that use /ৰ/ as b (ideally from different publishers)?

Cheers,

Vinodh

chakrabortydeepro commented 2 years ago

@virtualvinodh thank you so much for adding this feature to Aksharamukha. It would be excellent if this character gets into Unicode. Apart from the great examples given by @milindchakraborty, I could find a few more from some other publications:

  1. @book{1, address = {কলকাতা}, title = {মহাকবিশ্রীহর্ষদেবকৃত রত্নাবলী}, publisher = {সংস্কৃত বুক ডিপো}, editor = {সেনগুপ্তা, আচার্য্য জ্যোতি}, year = {2016}, } See: line 4 (p. 29)

    naṣṭaṃ mantrabalair...

image Thanks to @parthasarathisil for sending me this reference.

  2. @book{2, address = {কলকাতা}, edition = {পরিবর্ধিত ষষ্ঠ সংস্করণ}, title = {মহাকবি‍-কালিদাস-প্রণীতম্ অভিজ্ঞান-শকুন্তলম্}, publisher = {Sanskrit Pustak Bhandar}, editor = {চক্রবর্তী, সত্যনারায়ণ}, year = {2007}, } See p. 22

    ...cumbiāiṃ ...cumbitāni... image

p. 24 (line 1 and line 5)

...aho rāgabaddhacitta... ...samyag anubodhito 'smi... image

  3. @book{3, address = {কলিকাতা}, title = {ভাষা বিজ্ঞান নামক বাঙ্গালা ভাষার ব্যাকরণ}, publisher = {হিতবাদী লাইব্রেরী}, editor = {সান্যাল, দুর্গাচন্দ্র}, year = {সন ১৩১৬}, url = {https://n2t.net/ark:/13960/t0zp9bv61}, } p. 17 image

I'm translating this passage--

"b and v Rule: 23: In the earlier language, both the shape and pronunciation of the antyastha (sic) v were different than the vargīya b. In Sanskrit texts handwritten by pandits the shape of a vargīya b is written as ৰ. But in the Bengali printed alphabet, both b and v are identical and because of their similar shape, their pronunciation has also become similar. In order to prevent this undesired thing, I have emended the letter ৰ. Afterwards, the letter ব should be pronounced like the English letter V and ৰ should be pronounced like the English letter B."

I certainly do not agree with what Durgachandra Sanyal says. Representing b as ৰ in Bengali appears to be a new feature, most probably developed by analogy with the Devanagari letter ब. I have never seen b written as ৰ in Sanskrit manuscripts; there, b and v are identical. b and v are written differently in Assamese and Manipuri, and this distinction (ব and ৱ) is sometimes visible in Sanskrit manuscripts (very rarely though).

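You can also find blog posts like this (https://draminbd.com/%E0%A6%AC%E0%A6%BE%E0%A6%82%E0%A6%B2%E0%A6%BE-%E0%A6%AC%E0%A6%B0%E0%A7%8D%E0%A6%A3%E0%A6%AE%E0%A6%BE%E0%A6%B2%E0%A6%BE%E0%A7%9F-%E0%A6%AC-%E0%A6%85%E0%A6%A8%E0%A7%8D%E0%A6%A4%E0%A6%B8%E0%A7%8D/) where people represent the vargīya b as ৰ.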

virtualvinodh commented 2 years ago

Thanks for this!

I think it should totally be a new letter.

Sorry for bothering you again. But if you could send some more examples with conjuncts, say /rba/, /bba/, /bda/ and /dba/, that would help show in the proposal that the character behaves differently from Assamese /ra/ and hence needs a different encoding.

chakrabortydeepro commented 2 years ago

@virtualvinodh

  1. There is an example of a conjunct /mbi/ in the previous comment.

Here are some more:

  1. /bra/ from @book{4, address = {কলিকাতা}, edition = {দ্বিতীয় সংস্করণ}, title = {সংস্কৃত সাহিত্যের ইতিহাস}, publisher = {পশ্চিমবঙ্গ রাজ্য পুস্তক পর্ষৎ (West Bengal State Book Board)}, editor = {বন্দ্যোপাধ্যায়, ধীরেন্দ্রনাথ}, year = {2000}, } See p. 397: so 'bravīt IMG_20220524_060735

  2. /rba/ from the same book p. 452 subandhur bāṇabhaṭṭaś ca IMG_20220524_062433

  3. /bdhi/ from book 2 of the previous examples: p. 2 vedābdhim IMG_20220524_070411

  4. /bda/ from book 2 of the previous examples: p. 3 ...śabda... IMG_20220524_070916

milindchakraborty commented 1 year ago

Indeed. This should be a totally new letter keeping conjuncts in mind. Do keep me posted if there are any Unicode updates. I am sorry for not being able to contact you earlier, as I mentioned in the other post.

Regards Milind Chakraborty

milindchakraborty commented 1 year ago

@virtualvinodh Also while reviewing the updates as implemented in https://github.com/virtualvinodh/aksharamukha/issues/173

I noticed a few discrepancies that might need to be rectified in Aksharamukha Bangla transliteration.

I tried using the দৃঢ আষাঢ → দৃঢ় আষাঢ় feature on the following Devanagari words and this is what I got...

Input: वयस्क, बुद्धि, आडंबर, युद्ध, जाड्य, गड्डलिका, ढाका, डमरु, सय्यम, ओड्र, आढ्य, हार्ड, वीर्य, दृढ, द्राविड, खड्ग, षड्विधि

Output (without feature): বয়স্ক, বুদ্ধি, আডম্বর, যুদ্ধ, জাড্য, গড্ডলিকা, ঢাকা, ডমরু, সয়্যম, ওড্র, আঢ্য, হার্ড, বীর্য, দৃঢ, দ্রাবিড, খড্গ, ষড্বিধি

Output (with feature): বয়স্ক, বুদ্ধি, আড়ম্বর, যুদ্ধ, জাড়্য, গড়্ড়লিকা, ঢাকা, ডমরু, সয়্যম, ওড়্র, আঢ়্য, হার্ড়, বীর্য, দৃঢ়, দ্রাবিড়, খড়্গ, ষড়্বিধি

As I mentioned, the letters ড়, ঢ় and য় behave similarly. Word-initially we always use ড, ঢ and য (e.g. ঢাকা, ডমরু, যুদ্ধ).

In non-initial positions ড, ঢ and য become ড়, ঢ় and য়, but only outside conjuncts. বয়স্ক, আড়ম্বর, দৃঢ়, দ্রাবিড় are correct; জাড়্য, গড়্ড়লিকা, সয়্যম, ওড়্র, আঢ়্য, হার্ড় are not (I will come to খড়্গ, ষড়্বিধি later). They should be জাড্য, গড্ডলিকা, সয্যম, ওড্র, আঢ্য, হার্ড.

ড্য, ড্ড, য্য, ড্র, ঢ্য, র্ড, র্য etc. are conjuncts and will never have ড়, ঢ় or য় in them. Interestingly, বীর্য is rendered correctly, probably because ya has been encoded differently. ড, ঢ and য stay as they are when they are part of a consonant cluster, that is, when they are adjacent to a VIRAMA on either side.
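The basic rule described above can be sketched as a small post-processing pass. This is only an illustration under stated assumptions, not Aksharamukha's actual implementation: the function name `add_nukta` is hypothetical, the dotted letters are written in decomposed form (base letter followed by NUKTA, U+09BC), and a word boundary is approximated as "the previous character is not in the Bengali block".

```python
VIRAMA = "\u09CD"  # hasanta
NUKTA = "\u09BC"   # the dot that turns ড, ঢ, য into ড়, ঢ়, য়
TARGETS = {"\u09A1", "\u09A2", "\u09AF"}  # ড, ঢ, য


def is_bengali(ch: str) -> bool:
    """True if ch lies in the Bengali Unicode block."""
    return "\u0980" <= ch <= "\u09FF"


def add_nukta(text: str) -> str:
    """Add NUKTA to non-initial ড, ঢ, য that are not part of a conjunct."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if ch not in TARGETS:
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        word_initial = not prev or not is_bengali(prev)
        in_conjunct = prev == VIRAMA or nxt == VIRAMA
        already_dotted = nxt == NUKTA
        if not (word_initial or in_conjunct or already_dotted):
            out.append(NUKTA)
    return "".join(out)
```

With this sketch, দৃঢ gains the dot, while ঢাকা (word-initial) and জাড্য (conjunct) are left unchanged. The exceptions discussed next are deliberately not handled here.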

As far as I know, there are only a few exceptions to this rule. ड्ग becomes ড়্গ in Bangla orthography, not ড্গ, so খড়্গ is correctly encoded. By extension, ड्क should be ড়্‌ক (we have words like খিড়কি, হুড়কো, which are pronounced খিড়্‌কি, হুড়্‌কো). This should be individually encoded.

If I were to extend the rule phonologically, ड्व should be ড্ব, but ड्ब should become ড়্‌ব (we have words like পড়বে that are read as পোড়্‌বে). By extension, ड्प should be ড়্‌প. Likewise for त and द: ড়্‌ত, ড়্‌দ (বাড়তি, বড়দা pronounced বাড়্‌তি, বড়্‌দা), and for च and ज: ড়্‌চ, ড়্‌জ (কড়চা pronounced কড়্‌চা). The first note of Saregamapa is called षड्ज and becomes ষড়্‌জ in Bangla.

I guess you see the pattern: ড becomes ড় in conjuncts with varga letters. This holds for the unaspirated ones; I can't really say about the aspirated ones, because I haven't seen conjuncts like ড়্‌খ/ ড়্‌ভ/ ড়্‌ধ. ঢ and য should ideally follow the same pattern, but I haven't seen conjuncts like য্‌গ, য্‌ক, ঢ়্‌গ, ঢ়্‌ব, ঢ়্‌চ etc. Most of these conjuncts plausibly don't exist in Sanskrit either; this is just an extension of the phonological cause of the rule and the expected behaviour.

It should also be mentioned that the nasal conjuncts ण्ड/ ंड, ण्ढ/ ंढ, ंय stay ণ্ড, ণ্ঢ, ংয; they don't take ড়, ঢ়, য় (কাণ্ড, কাণ্ঢার, সংযম), following anuswara or even visarga. Nor do ড, ঢ, য change to ড়, ঢ়, য় when they geminate or when they are attached to non-varga letters like য, র, ল, ব etc.
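The proposed treatment of Devanagari ड् before another consonant can be summarised in a short lookup. This is a sketch of the pattern as described in this comment, not of any existing code: `render_dda_cluster` and both tables are hypothetical, and only the second consonants listed here are supported.

```python
VIRAMA = "\u09CD"
ZWNJ = "\u200C"
DDA = "\u09A1"          # ড
RRA = DDA + "\u09BC"    # ড় (decomposed: ড + NUKTA)

# Devanagari unaspirated varga stops and their Bengali counterparts.
UNASPIRATED = {"क": "ক", "ग": "গ", "च": "চ", "ज": "জ",
               "त": "ত", "द": "দ", "प": "প", "ब": "ব"}
# A few non-varga (or geminating) second members, which keep plain ড.
OTHER = {"व": "ব", "य": "য", "र": "র", "ड": "ড"}


def render_dda_cluster(second: str) -> str:
    """Bengali rendering of Devanagari ड् followed by `second`."""
    if second == "ग":
        # ड्ग is written as a true conjunct with ড়, as in খড়্গ.
        return RRA + VIRAMA + UNASPIRATED[second]
    if second in UNASPIRATED:
        # Other unaspirated varga stops: ড় with a visible hasanta (ZWNJ).
        return RRA + VIRAMA + ZWNJ + UNASPIRATED[second]
    # Everything else listed keeps ড in an ordinary conjunct (ड्व stays ড্ব).
    return DDA + VIRAMA + OTHER[second]
```

Aspirated second members are left out on purpose, since the comment above does not commit to a rendering for them; an unsupported letter raises `KeyError`.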

As an aside, ষড়্‌বিধি is the correct way to write षड्विधि (with ZWNJ), but it has ড় only because it is the result of Sandhi, as we discussed earlier: "षट्‌ + विधि". Bangla doesn't write Sandhi-joined consonants, such as the ones that get voiced on joining, as consonant conjuncts; it shows the hasanta (VIRAMA) explicitly, for example বাগ্‌ধারা (not বাগ্ধারা), বাক্‌যুদ্ধ (not বাক্যুদ্ধ), ঋগ্‌বেদ (not ঋগ্বেদ), সদ্‌ব্যবহার (not সদ্ব্যবহার), প্রাক্‌কথন (not প্রাক্কথন). So to Bangla it isn't ড্ব but ড্‌ব (with ZWNJ), hence the change to ড়: ড়্‌ব. Since Sandhi is beyond the scope of Aksharamukha, ड्व will stay ড্ব only, like all clusters.

Also, as we discussed earlier in the case of clusters, I would like Aksharamukha to render VIRAMA+VA and VIRAMA+BA differently for Bangla. VIRAMA+BA in Devanagari should become VIRAMA+ZWNJ+BA in Bangla, while VIRAMA+VA stays VIRAMA+BA. This is in accordance with both the West Bengal Bangla Academy and the Bangladesh Bangla Academy.

I rendered the following words...

Input: उद्बोधन, विद्वान, ज्वर, क्वाथ, मातृत्व, उद्बुद्ध, बल्ब, गर्व, कार्ब, किंवा, संबंध

Output: উদ্বোধন, বিদ্বান, জ্বর, ক্বাথ, মাতৃত্ব, উদ্বুদ্ধ, বল্ব, গর্ব, কার্ব, কিম্বা, সম্বন্ধ

In clusters, Bangla treats VA and BA differently, even in pronunciation. उद्बोधन will be udbodhôn while विद्वान will be biddān. In case of anuswara too, किंवा is kiṅbā and संबंध is šômbôndho.

Except for ṁb/mb (ंब/म्ब > ম্ব), rb/rv (र्ब/र्व > র্ব), tb (त्ब > ৎব), everywhere else VIRAMA+BA should be rendered as VIRAMA+ZWNJ+BA. উদ্‌বোধন, উদ্‌বুদ্ধ, বাল্‌ব, সদ্‌বুদ্ধি (not উদ্বোধন, উদ্বুদ্ধ, বাল্ব, সদ্বুদ্ধি)

Note: সম্বন্ধ, কার্ব (संबंध, कार्ब) won't be written সম্‌বন্ধ, কার্‌ব. Also, I mentioned that 'tb' should have a separate encoding because कत्बेल (katbela) should become কৎবেল, while आमित्वेर (āmitver) should become আমিত্বের. There is no ZWNJ at play here, but the difference between tb (त्ब > ৎব) and tv (त्व > ত্ব) is shown in the script.

Also, ṁv (ंव) should never be rendered as ম্ব but as ংব. It is কিংবা not কিম্বা.
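The cluster rules requested above (VIRAMA+BA versus VIRAMA+VA, with the ṁb/mb, rb/rv, tb/tv and ṁv special cases) can be written down as one decision function. This is a sketch only: `bangla_b_cluster` is a hypothetical helper, and the generic case relies on the parallel layout of the Devanagari and Bengali Unicode blocks (an offset of 0x80), which holds for the ordinary consonants used here.

```python
VIRAMA, ZWNJ = "\u09CD", "\u200C"
# Devanagari input characters
D_BA, D_VA = "\u092C", "\u0935"          # ब, व
D_ANUSVARA = "\u0902"                    # ं
D_RA, D_MA, D_TA = "\u0930", "\u092E", "\u0924"  # र, म, त
# Bengali output characters
B_BA, B_MA, B_RA, B_TA = "\u09AC", "\u09AE", "\u09B0", "\u09A4"  # ব, ম, র, ত
B_KHANDA_TA, B_ANUSVARA = "\u09CE", "\u0982"                     # ৎ, ং


def bangla_b_cluster(first: str, second: str) -> str:
    """Bengali spelling of Devanagari `first` + VIRAMA + ब/व (or anusvara + ब/व)."""
    if second not in (D_BA, D_VA):
        raise ValueError("second must be Devanagari BA or VA")
    is_ba = second == D_BA
    if first == D_ANUSVARA:
        # ंब > ম্ব, but ंव > ংব (কিংবা, not কিম্বা)
        return B_MA + VIRAMA + B_BA if is_ba else B_ANUSVARA + B_BA
    if first == D_RA:
        return B_RA + VIRAMA + B_BA            # र्ब / र्व > র্ব
    if first == D_MA and is_ba:
        return B_MA + VIRAMA + B_BA            # म्ब > ম্ব
    if first == D_TA:
        # त्ब > ৎব, त्व > ত্ব
        return B_KHANDA_TA + B_BA if is_ba else B_TA + VIRAMA + B_BA
    # Generic consonant: shift from the Devanagari to the Bengali block.
    b_first = B_BA if first == D_VA else chr(ord(first) + 0x80)
    if is_ba:
        return b_first + VIRAMA + ZWNJ + B_BA  # visible hasanta before BA
    return b_first + VIRAMA + B_BA             # ordinary conjunct for VA
```

For example, द + ब gives দ্‌ব (as in উদ্‌বোধন), while द + व gives দ্ব (as in বিদ্বান).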

P.S. About Bangla to Devanagari conversion, I would suggest a few things. I notice that Bangla ব is, as a rule, rendered as ba all the time.

I tried this set of Sanskrit Shlokas:

Input: ওঁ সর্ব মঙ্গল মাঙ্গল্যে শিবে সর্বার্থ সাধিকে। শরণ্যে ত্র্যম্বকে গৌরি নারায়ণি নমোঽস্তু তে।।

হে কৃষ্ণ করুণাসিন্ধু দীনবন্ধু জগৎপতে। গোপেশ গোপিকাকান্ত রাধাকান্ত নমোঽস্তু তে।। ওঁ ব্রহ্মণ্য দেবায় গোব্রহ্মণ্য হিতায় চ। জগদ্ধিতায় কৃষ্ণায় গোবিন্দায় বাসুদেবায় নমো নমঃ।।

গুরুর্ব্রহ্মা গুরুর্বিষ্ণু গুরুর্দেবো মহেশ্বরঃ। গুরুরেব পরব্রহ্ম তস্মৈ শ্রীগুরবে নমঃ।। অখণ্ডমণ্ডলাকারং ব্যাপ্তং যেন চরাচরম্। তৎপদং দর্শিতং যেন তস্মৈ শ্রীগুরবে নমঃ।।

ওঁ তদ্বিষ্ণুঃ পরমং পদং সদা পশ্যন্তি সূরয়ঃ দিবীব চক্ষুরা ততম্‌।

Output: ॐ सर्ब मङ्गल माङ्गल्ये शिबे सर्बार्थ साधिके। शरण्ये त्र्यम्बके गौरि नाराय़णि नमोऽस्तु ते॥

हे कृष्ण करुणासिन्धु दीनबन्धु जगत्पते। गोपेश गोपिकाकान्त राधाकान्त नमोऽस्तु ते॥ ॐ ब्रह्मण्य देबाय़ गोब्रह्मण्य हिताय़ च। जगद्धिताय़ कृष्णाय़ गोबिन्दाय़ बासुदेबाय़ नमो नमः॥

गुरुर्ब्रह्मा गुरुर्बिष्णु गुरुर्देबो महेश्बरः। गुरुरेब परब्रह्म तस्मै श्रीगुरबे नमः॥ अखण्डमण्डलाकारं ब्याप्तं येन चराचरम्। तत्पदं दर्शितं येन तस्मै श्रीगुरबे नमः॥

ॐ तद्बिष्णुः परमं पदं सदा पश्यन्ति सूरय़ः दिबीब चक्षुरा ततम्।

This is the output we get. What is noticeable here is that in Sanskrit 'v' occurs way more often than 'b'. This was also the reason why they used a new character for ba (ৰ) instead of va (ব) in Sanskrit texts, to make the texts look not too different from traditional orthography.

From what I observe in the Sanskrit texts I've seen in Odia online, even they stick to ବ for va everywhere while using ଵ for ba. People claim the opposite sometimes, but I am stating what I have observed in Odia transliterations of Sanskrit Slokas online.

Odia Examples: Bhagwad Geeta in Odia: କର୍ମଯୋଗ ବିନା ଅର୍ଜୁନ | କେବଳ କର୍ମତ୍ୟାଗ ଜାଣ || ଦୁଃଖନାଶ କରି ନଥାଇ | ନା ଯୋଗ ସିଦ୍ଧି ଦେଇଥାଇ || କର୍ମଯୋଗରେ ଯୁକ୍ତ ନର | ଅଚିରେ ଵ୍ରହ୍ମ ପ୍ରାପ୍ତି ତା'ର || ୬ || (This is in Odia and not even in Sanskrit)

ଓଁ ତତ୍ସଦିତି ଶ୍ରୀମଦ୍ଭଗବଦ୍ଗୀତାସୂପନିଷତ୍ସୁ ଵ୍ରହ୍ମବିଦ୍ୟାୟାଂ ଯୋଗ ଶାସ୍ତ୍ରେ ଶ୍ରୀକୃଷ୍ଣାର୍ଜୁନ ସମ୍ବାଦେ କର୍ମସନ୍ୟାସ ୟୋଗୋ ନାମ ପଞ୍ଚମ ଅଧ୍ୟାୟଃ || ୫ || (This is Sanskrit) (Source: https://write.as/geetanabakshyari/pnycm-adhyyaayy-krmsnyyaas-yog)

କର୍ମ ଵ୍ରହ୍ମୋଦ୍ଭବଂ ବିଦ୍ଧି ଵ୍ରହ୍ମାକ୍ଷରସମୁଦ୍ଭବମ୍। ତସ୍ମାତ୍ସର୍ବଗତଂ ଵ୍ରହ୍ମ ନିତ୍ଯଂ ଯଜ୍ଞେ ପ୍ରତିଷ୍ଠିତମ୍॥୧୫॥

ଅହଂ ସର୍ବସ୍ଯ ପ୍ରଭବୋ ମତ୍ତଃ ସର୍ବଂ ପ୍ରବର୍ତତେ। ଇତି ମତ୍ବା ଭଜନ୍ତେ ମାଂ ଵୁଧା ଭାବସମନ୍ବିତାଃ॥୮॥ ମଚ୍ଚିତ୍ତା ମଦ୍ଗତପ୍ରାଣା ଵୋଧଯନ୍ତଃ ପରସ୍ପରମ୍। କଥଯନ୍ତଶ୍ଚ ମାଂ ନିତ୍ଯଂ ତୁଷ୍ଯନ୍ତି ଚ ରମନ୍ତି ଚ॥୯॥ (Source: http://kevincarmody.com/vedic/bhagavadgitaori.html)

Anyway, except for त्र्यम्बके (which obviously has mb and not ṁb so can only be 'b' and not 'v' like we discussed above), दीनबन्धु, ब्रह्मण्य, गुरुर्ब्रह्मा, and परब्रह्म everywhere else 'va' should have been used.

So I suggest four things:

  1. There should be a feature that we can select to render all b's as v's.

  2. Always render ম্ব as mb and not mv, in any case.

  3. Always render ত্ব as tv and ৎব as tb, in any case; there are no exceptions to this. (And vice versa.)

  4. VIRAMA+ZWNJ+BA in the source text will always be rendered as VIRAMA+BA and never with VA. (Ex. উদ্‌বোধন, প্রাগ্‌বোধ, বাগ্‌ব্রহ্ম will always be उद्बोधन, प्राग्बोध, वाग्ब्रह्म)
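The four suggestions can be read as a decision procedure for each ব in Bangla source text. A minimal sketch, assuming a hypothetical helper `classify_ba` that only reports "b" or "v" per occurrence rather than transliterating the whole string; the `default` argument stands in for the selectable feature from the first suggestion.

```python
VIRAMA, ZWNJ = "\u09CD", "\u200C"
BA, MA, TA, KHANDA_TA = "\u09AC", "\u09AE", "\u09A4", "\u09CE"  # ব, ম, ত, ৎ


def classify_ba(text: str, default: str = "v") -> list:
    """Return 'b' or 'v' for every ব in `text`, in order of occurrence."""
    result = []
    for i, ch in enumerate(text):
        if ch != BA:
            continue
        before = text[max(0, i - 2):i]  # up to two preceding characters
        if before.endswith(VIRAMA + ZWNJ):
            result.append("b")          # suggestion 4: visible hasanta
        elif before.endswith(MA + VIRAMA):
            result.append("b")          # suggestion 2: ম্ব is always mb
        elif before.endswith(KHANDA_TA):
            result.append("b")          # suggestion 3: ৎব is tb
        elif before.endswith(TA + VIRAMA):
            result.append("v")          # suggestion 3: ত্ব is tv
        else:
            result.append(default)      # suggestion 1: selectable default
    return result
```

With `default="v"`, উদ্‌বোধন yields `["b"]` and মাতৃত্ব yields `["v"]`. Words such as ব্রহ্মণ্য, which need an initial b, would still come out as v under this default, which is the trade-off the first suggestion accepts.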

Note: In case it is implemented, or planned, differently, the conjunct form tb, when we use the ৰ-for-b and ব-for-v orthography, should be ৎৰ only and not look like ত্ব. Here the behaviour of BENGALI LETTER ALTERNATE BA will differ from BENGALI LETTER BA in Unicode terms. Does this call for the proposal to be revised a bit since ত্ব in Bangla is reserved for tv while ৎব for tb?

photo_2023-06-10_21-54-48

(My friend Sagir Ahmed drew this; a few sample conjuncts and how they should look going by behaviour and readability.)

Please also incorporate a feature that renders both য় and য as ya in the target language (without showing the kind of ya), and another feature that renders the pairs ড, ড় and ঢ, ঢ় as ḍa and ḍha.

Moreover, there should be a feature to select using ৰ for ba and ব for va on the Bangla to another language side as well. In this case, all ৰ's will be ba and ব's will be va; therefore if this feature is turned on, সম্‌ৰন্ধ, সদ্‌ৰ‌্যবহার, কৎৰেল will be rendered as सम्बन्ध, सद्ब्यवहार, कत्बेल, while সম্বন্ধ/সম্‌বন্ধ, সদ্‌ব্যবহার, কৎবেল will be सम्वन्ध, सद्व्यवहार, कत्वेल (which would otherwise always be rendered as सम्बन्ध, सद्ब्यवहार, कत्बेल irrespective of what feature is being used).

Regards Milind Chakraborty

milindchakraborty commented 1 year ago

I catalogued more than 1000 Sanskrit manuscripts in the Bengali script and I found both Sanskrit sounds v and b (sometimes r as well) are represented with the single character ব্ .

@chakrabortydeepro A bit off topic, but I would like to know if these manuscripts (or a part thereof) have been digitised by you? As well as any non-Sanskrit Bangla manuscripts; and if it is possible for me to study them?

Also, if it isn't too much to ask for, can you please show me some six to eight pictures each of how শ্রী and ওঁ are written in these Manuscripts from different (preferably older) Manuscripts along with the name/source of the manuscript and the year it was written?

chakrabortydeepro commented 1 year ago

The manuscripts were digitized by the Centre for Studies in Social Sciences, Calcutta under the Endangered Archive Programme. You can access the archive here https://eap.bl.uk/project/EAP781.

Here https://docs.google.com/spreadsheets/d/1WeN2zZe07jiREZnxRNz08Ywh3FZHjEUHzpkzvW9MsX8/edit?usp=sharing you can access the preliminary datasheet that I prepared. On the datasheet, you can check the manuscripts that are dated and then you can look for their images with the exact title on the EAP homepage https://eap.bl.uk/search?query=.

-- Deepro Chakraborty (he/him) PhD candidate Department of History, Classics, and Religion University of Alberta


milindchakraborty commented 1 year ago

I saw this manuscript of Viṣṇusahasranāmaṁ, which is stated to be copied in śaka 1238, i.e., 1316 CE. Is this date mentioned anywhere in the manuscript? Or is the copy rather new?

image