drdhaval2785 / SanskritSorting

Codes written by Dr. Dhaval Patel for Sanskrit Natural Language Programming

accent after ṁ in #aṁ/sa# #26

Closed drdhaval2785 closed 9 years ago

drdhaval2785 commented 9 years ago

aṁ/sa

PWK has an acute accent mark after ṁ. ṁ can be treated as both a vowel and a consonant. Where should it be sorted?

Grammatical proof of its being both a vowel and a consonant (Siddhantakaumudi):

[Image: capture]

gasyoun commented 9 years ago

I hope that it's just a mistake and nothing more, based on what I see at https://github.com/drdhaval2785/SanskritSorting/blob/master/accent_old2new.php

I suppose it's the |([/\\^])([aAiIuUfFeEoO])| line in @funderburkjim's code. He forgot (I suppose) that, in words with ṁ, the accents were put after it.

[Image: amca]

In older version: #/aṁsa#
Jim's version: #aṁ/sa#
My version: #a/ṁsa#

In MW:

a/MSa
a/MSa--karaRa
a/MSa--kalpanA
a/MSa--prakalpanA
a/MSa--pradAna
a/MSa--BAgin
a/MSa--BAj
a/MSa--BU/
a/MSa--BUta
aMSa-rUpiRI
a/MSa--vat
a/MSa--savarRana
a/MSa--svara
a/MSa--hara
a/MSa--hArin

In PWK:

afRin
a/MSa
aMSaka

And only in PWG are there 176 cases of M/:

afRin
aM/Sa
aMSa
aMsa
aM/sa
aMsakUwa
aM/satra
aM/satrakoSa
aMsaDrI/
aMsaBAra
aM/saBArika

So it's an input issue, not Jim's.

drdhaval2785 commented 9 years ago

The code seemed innocuous to me.

drdhaval2785 commented 9 years ago

This being a correction, let's shift it to the corrections repository. It is of no importance for the sorting repo.

drdhaval2785 commented 9 years ago

https://github.com/sanskrit-lexicon/CORRECTIONS/issues/19

Noted here. Closing the issue.

funderburkjim commented 9 years ago

re line

$z = preg_replace('|([/\\^])([aAiIuUfFeEoO])|',"$2$1",$y);

in https://github.com/drdhaval2785/SanskritSorting/blob/master/accent_old2new.php

This particular step does the work of moving accents from before a vowel to after a vowel. The first argument of preg_replace is the regular expression to search for (accent)(vowel). The second argument ($2$1) is the replacement, and means (vowel)(accent).

So the code looks ok.
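
As a rough illustration of the step described above, here is a Python analogue of the PHP substitution (not the repository's code). The function name is hypothetical. Treating the backslash as an accent mark alongside / and ^ is an assumption, since whether the PHP pattern matches it depends on how PHP unescapes the quoted string.

```python
import re

# Assumed accent marks: / ^ and \ ; the vowel set is copied from the PHP pattern.
ACCENT_THEN_VOWEL = re.compile(r"([/\\^])([aAiIuUfFeEoO])")


def move_accent_after_vowel(text):
    """Swap (accent)(vowel) into (vowel)(accent); leave everything else alone."""
    return ACCENT_THEN_VOWEL.sub(r"\2\1", text)


print(move_accent_after_vowel("/aMSa"))  # a/MSa
print(move_accent_after_vowel("aM/sa"))  # aM/sa
```

Because the pattern only fires when an accent directly precedes a vowel, an input that already carries the accent after M, such as aM/sa, passes through unchanged.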

This program is applied to monier.xml in deriving mw.xml.

In PWG, Thomas has already put the accents AFTER the vowel, so there is no need to adjust vowel placement.

gasyoun commented 9 years ago

"In PWG, Thomas has already put the accents AFTER the vowel, so there is no need to adjust vowel placement." so that is the reason why PWG is different. And according to the other files as they are up to date - wrong. In MW:

a/MSa

In PWK:

a/MSa

In PWG:

aM/Sa

Please fix PWG to be like MW and PWK, or vice versa. Please, no double and triple standards; we have enough of them by now.

funderburkjim commented 9 years ago

I asked Peter about this. In brief, he agrees that accents should go after the vowel, not after the anusvAra. Here's my question and Peter's response:

Jim:

Dhaval noted with word 'aMSa' in PWG, that Thomas coded  an (udAtta) accent AFTER M;
so with slp1 versions of accents:  aM/Sa.

He thought it should be after the 'a':  a/MSa.
Some discussion occurs in https://github.com/drdhaval2785/SanskritSorting/issues/26

Within the PWG text this M+accent form occurs many times (1700+).

Should we change this detail of the PWG coding to change vowel+M+accent to vowel+accent+M ?

Peter:

Yes.
SLP uniformly puts nasalization (~) after accent (/^\).
AnusvAra is a different matter.  Accent goes after the vowel, never after a consonant.  AnusvAra is a 
consonant, not a vowel, as opposed to the passage cited by drdhaval.  A few Sikza texts make 
ambiguous statements about anusvAra because it does get an anudAtta mark in Yajurvedic texts 
when the preceding vowel is low pitched.  SLP permits marking M with \, and H with accents for similar 
reasons.  In the dictionaries, M should never have an accent following it; the accent should always 
immediately follow a normal vowel.  This way the treatment of anusvAra (M) and nasalization (~) is 
parallel.

I've already implemented the headword key2 change for a/MSa in PWG that Dhaval submitted. And I'll now implement the analogous changes in PWG (for udAtta and other accents), so that the result is (vowel) + (accent) + (anusvara).
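
A minimal sketch of what such a change could look like, in Python. This is not the script actually run on PWG; the function name is hypothetical, the vowel set is borrowed from the accent_old2new.php pattern, and the accent marks are assumed to be / ^ and \.

```python
import re

# (vowel)(M)(accent) -> (vowel)(accent)(M); accent marks assumed to be / ^ \
VOWEL_M_ACCENT = re.compile(r"([aAiIuUfFeEoO])M([/\\^])")


def move_accent_before_anusvara(text):
    """Place the accent directly after the vowel instead of after anusvAra (M)."""
    return VOWEL_M_ACCENT.sub(r"\1\2M", text)


for headword in ["aM/Sa", "aMSa", "aM/satrakoSa", "aMsaDrI/"]:
    print(headword, "->", move_accent_before_anusvara(headword))
```

Headwords without an accent after M, or with the accent already after a vowel, are left as they are.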

drdhaval2785 commented 9 years ago

For the record, let me clarify that the quoted passage is not from any ambiguous Shiksha text. It is from a grammar book named siddhAntakaumudI, which is revered by almost all traditional scholars.

And counting the anusvAra as vowel has its own application in derivation of form of saMskartA.

[Images: capture, capture, capture]

funderburkjim commented 9 years ago

re: 'And counting the anusvAra as vowel has its own application in derivation of form of saMskartA.'

My knowledge of Sanskrit is too limited for the quoted passages. Could you paraphrase an explanation?

drdhaval2785 commented 9 years ago

@funderburkjim The rule 'anaci ca' (अनचि च) mandates that when there is a vowel + consonant + non-vowel pattern, the second member gets duplicated.

संस्कर्ता = स्‌+अ+ं+स्‌+क्‌+अर्ता

Here the first pattern, अ+ं+स्‌, qualifies for duplication of ं because ं is treated as a consonant. So अ (vowel) + ं (consonant) + स्‌ (not a vowel) -> अ+ं+ं+स्‌, and we get 'saMMskartA' (note the double anusvAra).

In the second case स्‌ also gets duplicated: ं+स्‌+क्‌ -> ं (treated as a vowel) + स्‌ (consonant) + क्‌ (not a vowel). The conditions for applying the rule 'anaci ca' are therefore satisfied, so we get ं+स्‌+स्‌+क्‌ -> संस्स्कर्ता (note the double 's').
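
The two cases above can be mimicked with a toy Python check on the SLP1 spelling saMskartA. This only illustrates the vowel + consonant + non-vowel pattern as paraphrased here; it is not an implementation of the rule itself. The function name, the flag, and the restriction to a single chosen position inside the word are all assumptions made for the sketch.

```python
BASE_VOWELS = set("aAiIuUfFxXeEoO")  # SLP1 vowel letters


def double_at(word, i, anusvara_is_vowel):
    """Double word[i] if (vowel, consonant, non-vowel) holds at i-1, i, i+1.

    Returns the new spelling, or None when the pattern does not hold.
    Word-final positions are not handled in this sketch.
    """
    def is_vowel(ch):
        return ch in BASE_VOWELS or (ch == "M" and anusvara_is_vowel)

    if not 0 < i < len(word) - 1:
        return None
    prev, cur, nxt = word[i - 1], word[i], word[i + 1]
    if is_vowel(prev) and not is_vowel(cur) and not is_vowel(nxt):
        return word[:i] + cur + word[i:]
    return None


print(double_at("saMskartA", 2, anusvara_is_vowel=False))  # saMMskartA
print(double_at("saMskartA", 3, anusvara_is_vowel=True))   # saMsskartA
```

With M counted as a consonant the doubling lands on M; with M counted as a vowel it lands on the following s, which is the point of the dual classification.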

The commentator explains this phenomenon by resorting to the following explanation: anusvAra, visarga, jihvAmUlIya, upadhmAnIya and yam - these five things are counted twice in the varNamAlA (mAhezvarasUtra). Their first occurrence is after 'a'. The second occurrence is in 'zar'. Let me explain this thing.

This is the arrangement of vowels and consonants in the grammatical scheme known as the mAhezvarasUtra:

१. अ इ उ ण्।
२. ऋ ऌ क्।
३. ए ओ ङ्।
४. ऐ औ च्।
५. ह य व र ट्।
६. ल ण्।
७. ञ म ङ ण न म्।
८. झ भ ञ्।
९. घ ढ ध ष्।
१०. ज ब ग ड द श्।
११. ख फ छ ठ थ च ट त व्।
१२. क प य्।
१३. श ष स र्।
१४. ह ल्।

The author says that the five entities noted above are counted twice in this mAhezvarasUtra: first after अ, and second in the group शर्‌.

१ अ ं ः ᳲ ᳲ इ उ ण्‌ (I don't know how to write यम्‌; it is something of Vedic importance.)
१३ श ष स ं ः ᳲ ᳲ र्‌ ।

In Sanskrit, groups 1 to 4 and their दीर्घ (long) counterparts are vowels (अच्‌); groups 5 to 14 are consonants (हल्‌).

As these five members are counted in group 1 and also in group 13, they are both vowels and consonants.
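
The double counting can be pictured as a small lookup table. The Python sketch below encodes only groups 1 and 13, with the five extra entities inserted as described above. The SLP1 letters, the spelled-out names for the sounds that have no single letter here, and the function name are choices made for this illustration.

```python
# Only groups 1 and 13 are encoded; the other twelve groups are omitted.
EXTRA = ["M", "H", "jihvAmUlIya", "upadhmAnIya", "yama"]
EXTENDED_GROUPS = {
    1: ["a"] + EXTRA + ["i", "u"],
    13: ["S", "z", "s"] + EXTRA,
}


def classes(sound):
    """Groups 1 to 4 count as vowels, groups 5 to 14 as consonants."""
    found = set()
    for number, members in EXTENDED_GROUPS.items():
        if sound in members:
            found.add("vowel" if number <= 4 else "consonant")
    return found


print(sorted(classes("M")))  # ['consonant', 'vowel']
print(sorted(classes("a")))  # ['vowel']
print(sorted(classes("s")))  # ['consonant']
```

A sound that appears in both groups comes back with both labels, which is the sense in which anusvAra is both a vowel and a consonant in this scheme.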

funderburkjim commented 9 years ago

What a thoughtful explanation. Thank you for providing a taste of the kind of reasoning that is used in Sanskrit grammar.

Somewhere, I encountered the Shiva Sutras, but had not understood an example of how they are used. I have a pdf copy of Laghu Kaumudi by James Ballantyne. Do you consider this a relatively reliable source for someone who wants to pursue such grammatical topics?

drdhaval2785 commented 9 years ago

Laghu kaumudi is an abridged version of Siddhantakaumudi. So, it should be a reliable source.