Misc. headword corrections #252

funderburkjim commented 8 years ago

While dealing with the ACC spelling error noticed in #250, several other random headword spelling errors were noticed in ACC and some other dictionaries. I'm putting these here, and may add some others before implementing the corrections.

ap:garutmata:garutmat:t:  virAma hard to see in print
pw:gardayitru:gardayitnu:t:  ligatures for 'tn' and 'tr' are similar 
ccs:tanatitnu:tanayitnu:t: tit -> yit
mw:gaqayitnu:gaqayitnu:n: Not a typo, cf gardayitnu. Different than gadayitnu
acc:lokapradopAnvayacandrikAnidAna:lokapradIpAnvayacandrikAnidAna:t: pradOpa -> pradIpa (Devanagari similar for dI and dO )
acc:karmatattvapradopikA:karmatattvapradIpikA:t: pradop->pradIp
acc:prAkftapradopikA:prAkftapradIpikA:t: pradop->pradIp
acc:kAlikAdopadAnaviDi:kAlikAdopadAnaviDi:n: ? or kAlikAdIpadAnaviDi
mw:sambanDopadfSa:sambanDopadeSa:t: dfSa -> deSa
mci:sundopasundayor:sundopasundayor:n: ? A dual form. first of two words,e 'upAKyAnam' as 
ieg:svacCandopaBogen:svacCandopaBogen:n: Is 'en' a Sanskrit word ending?
ap90:vAjasanoyen:vAjasaneyin:t:  Scan obscure
ap90:vyatiroken:vyatirekin:t: Devanagari print easy to misinterpret
mw:kakun:kakun:n: in compound for kakud
mw:kasun:kasun:n: grammatical term. also in ccs, md, pw
mw:vidyun:vidyun:n: in compound for vidyut
cae:druhUn:druhvan:t: difficult Devanagari ligature
ieg:ugappArpon:ugappArpon:n: Tamil. 11 words in ieg end in 'on'.
bop:maGon:maGon:n: v. maGavan
; 1906 headwords in hwnorm1_v1b.txt end in 'ar'.  For the most part,
; these appear to be variant spellings equivalent to ending 'f'.
; eg., 'kar' in CCS, GRA, PW, PWG, SCH is same as 'kf' in 
; This correspondence probably should be added to hwnorm1 rules.
; there are 21 words that (a) appear only in PUI, and (b) end in 'ini'.
; Examination of a few suggests that these are properly spelled in Sanskrit
; as ending in 'inI' (the feminine of 'in' adjective). Tempted to 'correct'
; key1 in these cases, so that these words will be 'reachable' as Sanskrit words.
acc:kUrmapUrARa:kUrmapurARa:p: This 'pUrARa' spelling occurs in vol. 3, the expected 'kUrmapurARa' spelling occurs in vol. 1
mw:AmardakatirTanATa:AmardakatIrTanATa:p: in supplement. Surely 'tirTa' should be 'tIrTa'
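
The records above share a colon-separated layout. Here is a minimal Python sketch for reading them, assuming the fields are dictionary code, old headword, new headword, a one-letter status, and a free-text note. The names `Correction` and `parse_correction` are hypothetical, and the sketch does not interpret the status letters; it only carries them through.

```python
from typing import NamedTuple, Optional

class Correction(NamedTuple):
    dictionary: str   # e.g. 'acc', 'mw' (lower-cased here)
    old: str          # headword as currently digitized (SLP1)
    new: str          # proposed headword (same as old when nothing changes)
    status: str       # one-letter code exactly as written in the record
    note: str         # free-text remark

def parse_correction(line: str) -> Optional[Correction]:
    """Parse 'dict:old:new:status: note'.

    Returns None for blank lines, ';' comment lines, and records that
    lack the one-letter status field.
    """
    line = line.strip()
    if not line or line.startswith(';'):
        return None
    parts = line.split(':', 4)           # the note may itself contain ':'
    if len(parts) < 5 or len(parts[3].strip()) != 1:
        return None
    dictionary, old, new, status, note = (p.strip() for p in parts)
    return Correction(dictionary.lower(), old, new, status, note)

rec = parse_correction("ccs:tanatitnu:tanayitnu:t: tit -> yit")
print(rec)
```

Records that omit the status field are returned as None rather than guessed at, so they can be reviewed by hand.
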
funderburkjim commented 8 years ago

mw:amatis,13758:amati:t: not sure why text shows (is) in the adjective form.

funderburkjim commented 8 years ago

mw:amatapadArTa:amataparArTa:n: possibly could be amata-parArTa See @zaaf2 comment below for why leaving this as no change for now.

mw:amatapadArTa:amataparArTa:p: Confirm by MW72 and PW

@drdhaval2785 or @zaaf2 I'm fairly certain of this change, but would like a second opinion.

MW: image

MW72: image

PW: image

Google Translate: eine zweite nicht zu billigende Bedeutung habend. ->

have a second not to approve importance.

gasyoun commented 8 years ago

have a second not to approve importance.

having a second meaning, that is not that important

zaaf2 commented 8 years ago

MW अमतपदार्थ seems to be the correct form.

पदार्थ [p= 583] : m. the meaning of a word [= German: Bedeutung]

पद n. a word

परार्थ a [p= 587] : m. the highest advantage or interest, an important object.

पर “highest” does not fit here. The Google translation “a second not to approve importance” makes no sense. It is “a second not to approve meaning”.

The original sense of Bedeutung is “meaning”, “sense”; only as a semantical development it has acquired the meaning “importance”. Cf. Deutsches Wörterbuch von Jacob Grimm und Wilhelm Grimm: BEDEUTUNG, f. 1) interpretation; 2) significatio, vis, auctoritas; BEDEUTUNGSLOS, insignificans; BEDEUTUNGSVOLL, significans, gravis

funderburkjim commented 8 years ago

@zaaf2 Thanks for input - I'll leave amata-padArTa in MW unchanged for now.

My first close encounter with the brothers Grimm :)

funderburkjim commented 8 years ago

ap90:aMrhata:arhaMta:t: confirmed ap;

funderburkjim commented 8 years ago

stc:anujnA:anujYA:p: print missing tilde over 'N'
stc:aBijnA:aBijYA:p: print missing tilde over 'N'
stc:avajnA:avajYA:p: print missing tilde over 'N'
stc:parijnAna:parijYAna:p: print missing tilde over 'n'

funderburkjim commented 8 years ago

ccs:akanizwa:akanizWa:t:
cae:kanizwaka:kanizWaka:p:
ap:kanizwa:kanizWa:p:

funderburkjim commented 8 years ago

These four headwords in AP have alternate forms, which the AP headword extraction algorithm does not properly handle; the result is that a parenthesis erroneously remains in key1:

equ(qU
gadyARa(na
drekka(kkA
peSa(za
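
One quick way to list any other keys with the same problem is to scan the key1 values for stray parentheses. The sketch below assumes the keys are already available as a list of strings; `keys_with_parens` is a hypothetical helper name, not part of the extraction code.

```python
def keys_with_parens(keys):
    """Return the key1 values that still contain '(' or ')'."""
    return [k for k in keys if '(' in k or ')' in k]

# The four AP keys mentioned above, plus one clean key for contrast.
sample = ['equ(qU', 'gadyARa(na', 'drekka(kkA', 'peSa(za', 'kanizWa']
print(keys_with_parens(sample))
```
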

funderburkjim commented 8 years ago

md:akfcCriMn:akfcCrin:p: Devanagari has an extra anusvAra. Confirm MW

funderburkjim commented 8 years ago

cae:SIban:SIvan:t: print is poor. Alphabetical order supports 'v'


funderburkjim commented 8 years ago

acc:Sukranoti:SukranIti:p: I/O confusion in Devanagari


shs:nirviDna:nirviGna:p: headword shows Devanagari 'D'


funderburkjim commented 8 years ago

yat:nirvvurdDi:nirbbudDi:p: 'rDi' is wrong. 'vv' should be 'bb' - but yat is careless about b/v.


funderburkjim commented 8 years ago

ap:nirhnAdaH:nirhrAdaH:t: confusion between 'hr' and 'hn' Devanagari ligatures


ccs:niSAniZam:niSAniSam:t: only 'Z' in SLP1 form of headwords.

funderburkjim commented 8 years ago

There are 5 headwords whose spelling includes 'zn'. I think this is not an optional form for 'zR' (SLP1 spelling), and thus should be corrected. MW confirms the 'zR' spelling.

bur:akzna:akzRa:p: 'zn' not possible. MW confirms
bur:akznA:akzRA:p: 'zn' not possible. MW confirms as instrumental of akzi


wil:akzna:akzRa:p: 'zn' not possible

shs:akzna:akzRa:p: 'zn' not possible. SHS copies Wilson

stc:kfznAyati:kfzRAyati:p: 'zn' not possible. Confirm MW kfzRAya

stc:tEkznya:tEkzRya:p: 'zn' not possible. Confirm MW and many other dictionaries

wil:pakznu:pakzRu: 'zn' not possible. Confirm MW and other dictionaries

yat:pakznu:pakzRu: 'zn' not possible. Confirm MW and other dictionaries.


drdhaval2785 commented 8 years ago

Regarding 'zn' not possible - a word of caution. क्षुभ्नादिषु च rule by Panini prevents conversion of 'n' to 'R' in words falling in क्षुभ्नादि गण. Note that क्षुभ्नाति should have been converted to क्षुभ्णाति by general law. But it is retained by this specific rule. And क्षुभ्नादि is an open-ended set (not a finite one): आकृतिगणम् in Paninian terminology. Be cautious.

funderburkjim commented 8 years ago

bur:prasIdAmiM:prasIdAmi:p: anusvAra not appropriate for this verb form


funderburkjim commented 8 years ago


Since you mention क्षुभ्नादि गण, this prompts me to ask a long-standing question.

For many entries in MW, there is mentioned that the word is in some gaRa. For instance,

aṁsa-bhāra [p= 1] : m. a burden on the shoulder, (g. bhastrā*di q.v.)

Is there some standard reference work that one may consult to know (at least in the case of a finite gaRa), the words comprising the gaRa? If such a work exists, is it available in scanned form at or elsewhere? Is there a digitized form of such a work?

drdhaval2785 commented 8 years ago

@funderburkjim There is a standard work known as गणपाठः.

It is given with every edition of siddhAntakaumudI. Digitized version - gaNapATha_SLP.txt

Commentary on गणपाठः, namely गणरत्नमहोदधिः

I have started to digitize that commentary right now. The ongoing effort can be seen here

zaaf2 commented 8 years ago

See Arthur Anthony Macdonell, A Sanskrit Grammar for Students, p. xiv:


funderburkjim commented 8 years ago

@drdhaval2785 and @zaaf2 Thank you for the references. I appreciate your knowledgeable contributions.

funderburkjim commented 8 years ago

Errors involving 'ng' instead of 'Ng'.

Most of these occur in PUI, but the cases are subclassified as follows.

In a few cases, the Ng spelling is confirmed by headwords in other dictionaries, and the 'ng' spelling was not in PUI. These were examined individually.

PD:angrahaRa:aNgrahaRa:n: aNgrahaRa also in PD.  Here 'an-grahaRa' is a grammatical term
CCS:nyanga:nyaNga:t: Typo confirmed. nyaNga Confirmed by cae,ccs,md,mw,mw72,pw,pwg,sch,shs,vcp,wil,yat,ap90,ap
MCI:sarvasAranga:sarvasAraNga:p: Confirmed as print error. sarvasAraNga Confirmed by inm,mw,pe,pw,pwg

In the next 33 cases, the 'ng' spelling occurs in PUI, and the corresponding 'Ng' spelling occurs in one or more other dictionaries. After examining a few of these individually, I decided it is safe to consider all of them to be print errors in PUI.

PUI:KawvAnga:KawvANga:p: KawvANga Confirmed by cae,ccs,ieg,inm,md,mw,pe,pw,pwg,shs,vcp,skd
PUI:KawvAngada:KawvANgada:p: KawvANgada Confirmed by pw
PUI:caturanga:caturaNga:p: caturaNga Confirmed by bhs,cae,ccs,gra,ieg,md,mw,pe,pw,pwg,shs,vcp,wil,yat,skd
PUI:caturangabala:caturaNgabala:p: caturaNgabala Confirmed by bhs,mw,shs
PUI:citrAngada:citrANgada:p: citrANgada Confirmed by cae,ccs,inm,mw,pe,pw,pwg,shs,vcp,wil,yat,skd
PUI:citrAngI:citrANgI:p: citrANgI Confirmed by mw,pe,skd
PUI:janga:jaNga:p: jaNga Confirmed by cae,ccs,md,mw,mw72,pw,pwg,ap
PUI:tangaRa:taNgaRa:p: taNgaRa Confirmed by cae,ccs,mci,md,mw,mw72,pe,pw,pwg
PUI:patanga:pataNga:p: pataNga Confirmed by bop,bur,ieg,inm,pe,pui,shs,vcp,vei,wil,yat,ben,bhs,cae,ccs,gra,md,mw,pw,pwg,sch,skd,ap90,ap
PUI:parizvanga:parizvaNga:p: parizvaNga Confirmed by ben,bop,bur,cae,ccs,mw,mw72,pw,pwg,shs,stc,vcp,wil,yat,ap,skd,ap90
PUI:piSanga:piSaNga:p: piSaNga Confirmed by ap90,ap,ben,bur,cae,ccs,gra,inm,mci,md,mw,mw72,pe,pw,pwg,sch,shs,stc,vcp,vei,wil,yat,skd
PUI:pUrRotsanga:pUrRotsaNga:p: pUrRotsaNga Confirmed by mw,pw,pwg
PUI:Bujanga:BujaNga:p: BujaNga Confirmed by bop,ieg,pe,shs,vcp,wil,yat,ben,cae,ccs,md,mw,pw,pwg,sch,stc,ap90,ap,skd
PUI:Bujangama:BujaNgama:p: BujaNgama Confirmed by shs,vcp,wil,yat,ben,cae,ccs,md,mw,pw,pwg,sch,ap,skd,ap90
PUI:matanga:mataNga:p: mataNga Confirmed by ben,mw,bop,bur,cae,ccs,inm,md,mw72,pe,pw,pwg,sch,shs,stc,vcp,wil,yat,ap,skd,ap90
PUI:matangavApI:mataNgavApI:p: mataNgavApI Confirmed by inm,mci,pw,pwg,vcp,mw
PUI:mangala:maNgala:p: maNgala Confirmed by ap90,acc,ap,ben,bhs,bop,bur,cae,ccs,gra,ieg,inm,md,mw,mw72,pe,pw,pwg,sch,shs,stc,vcp,vei,wil,yat,skd
PUI:mangalaprasTa:maNgalaprasTa:p: maNgalaprasTa Confirmed by mw,pw,pwg
PUI:mangalA:maNgalA:p: maNgalA Confirmed by md,mw,skd
PUI:mAtanga:mAtaNga:p: mAtaNga Confirmed by acc,ben,bhs,bop,bur,cae,ccs,ieg,inm,md,mw72,pe,pw,pwg,sch,shs,stc,vcp,wil,yat,mw,ap90,ap,skd
PUI:mfdanga:mfdaNga:p: mfdaNga Confirmed by mw,ben,bop,bur,cae,ccs,md,mw72,pw,pwg,shs,stc,vcp,wil,yat,ap,skd,ap90
PUI:vanga:vaNga:p: vaNga Confirmed by ben,bhs,bop,bur,cae,ccs,inm,mci,md,mw,mw72,pe,pgn,pw,pwg,sch,shs,stc,vcp,vei,wil,yat,skd,ap
PUI:vangaka:vaNgaka:p: vaNgaka Confirmed by mw,pw
PUI:vizanga:vizaNga:p: vizaNga Confirmed by ben,cae,mw,mw72,pw,pwg
PUI:Sanga:SaNga:p: SaNga Confirmed by cae,ccs,mw,pw,pwg
PUI:SrIranga:SrIraNga:p: SrIraNga Confirmed by ieg,mw,pw,pwg
PUI:zaqangavid:zaqaNgavid:p: zaqaNgavid Confirmed by cae,mw
PUI:satsanga:satsaNga:p: satsaNga Confirmed by cae,ccs,md,mw,pw,pwg,shs,wil,yat
PUI:saptAnga:saptANga:p: saptANga Confirmed by cae,ccs,ieg,md,mw,pw,pwg,shs,wil,yat
PUI:sarvamangalA:sarvamaNgalA:p: sarvamaNgalA Confirmed by acc,bop,mw,vcp,shs,skd,wil,yat
PUI:sarvAngasundarI:sarvANgasundarI:p: sarvANgasundarI Confirmed by acc,mw,pwg
PUI:sumangala:sumaNgala:p: sumaNgala Confirmed by bur,cae,ccs,gra,mw,pw,pwg,shs,stc,wil,yat
PUI:haryanga:haryaNga:p: haryaNga Confirmed by mw,pw,pwg
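
The "Confirmed by" lists above come from looking up the proposed 'Ng' spelling among the headwords of the other dictionaries. Here is a minimal sketch of that kind of lookup, assuming the headwords are held in a mapping from dictionary code to a set of SLP1 strings. The function names and the toy data are hypothetical; this is not the script actually used.

```python
def confirming_dicts(new, headwords_by_dict):
    """Return the sorted codes of dictionaries that list `new` as a headword."""
    return sorted(code for code, words in headwords_by_dict.items()
                  if new in words)

def ng_candidates(source, headwords_by_dict):
    """Yield (old, new, confirming codes) for each headword of `source`
    that contains 'ng', with 'Ng' proposed as the replacement."""
    for old in sorted(headwords_by_dict[source]):
        if 'ng' in old:
            new = old.replace('ng', 'Ng')
            yield old, new, confirming_dicts(new, headwords_by_dict)

# Toy data only; the real headword lists are far larger.
toy = {
    'pui': {'caturanga', 'lakzmIranganA'},
    'mw':  {'caturaNga', 'maNgala'},
    'pw':  {'caturaNga'},
}
for old, new, confirmed in ng_candidates('pui', toy):
    print(f"PUI:{old}:{new}: confirmed by {','.join(confirmed) or 'none'}")
```

A candidate with no confirming dictionary is not necessarily wrong; as the later groups show, those cases still need individual checking.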

Next are cases where (a) no confirmation was found and (b) the dictionary was NOT PUI. These were individually checked, with various resolutions, as shown:

VEI:ekayavangAMdama:ekayavaNgAMdama:n: two words: ekayavan gAMdama
ACC:tripAWingovarDanadIkzita:tripAWiNgovarDanadIkzita:n: 3 words tripAWin govarDana dIkzita
MW:durvAsodarpaBanga:durvAsodarpaBaNga:t: confirmed by scan
IEG:pongadyARa:poNgadyARa:n:  the 'pon' part is Tamil for Gold,
ACC:bAlaSAstringorde:bAlaSAstriNgorde:n: two words bAlaSAstrin gorde (name of author)
IEG:bengali:beNgali:n: Anglicized spelling. Should be classified as non-sanskrit word
IEG:vIracampanguligE:vIracampaNguligE:n: proper name. Not sure if Sanskrit word
VEI:saMvartaAngirasa:saMvartaANgirasa:p: proper name. two words saMvarta + ANgirasa  (Angirasa seems print err.)
IEG:sarang:saraNg:n: Prob. an Anglicization, and thus not a Sanskrit word
IEG:sarAhang:sarAhaNg:n: Probably not a Sanskrit word
IEG:sarhang:sarhaNg:n: Stated to be a Persian word
MW:suviBaktAnavadyAngI:suviBaktAnavadyANgI:t: scan confirms a typo

The final subgroup consists of (a) headwords only in PUI, and (b) with no automatic confirmation of the 'Ng' spelling. These were examined individually. In many cases, the word was seen to be a compound of known words. All are classified as print errors.

PUI:aTarvAngIrasI:aTarvANgirasI:p:  confirm PD. Also change 'gI' to 'gi'
PUI:digangana:digaNgana:p: cpd diS+aNgana
PUI:piSangavarRa:piSaNgavarRa:p: compound piSaNga-varRa
PUI:pratyangirasayogA:pratyaNgirasayogA:p: compound pratyaNgirasa-yogA (not sure why long A)
PUI:plavangamAtanga:plavaNgamAtaNga:p: compound plavaNga-mAtaNga
PUI:matangapadam:mataNgapadam:p: compound mAtaNga-padam
PUI:matangavanam:mataNgavanam:p: compound mAtaNga-vanam
PUI:mAtulangasTalI:mAtulaNgasTalI:p: compound mAtulaNga-sTalI
PUI:yozitsanga:yozitsaNga:p: compound yozit-saNga
PUI:rangam:raNgam:p:  raNga confirmed by many dictionaries. Not sure why the ending 'm'
PUI:lakzmIranganA:lakzmIraNganA:p: probable compound lakzmI-raNganA, though exact form raNganA not found
PUI:SivAnangavallaBA:SivAnandavallaBA:p:  SivAnaNga doesn't make sense. So assume author meant  SivAnanda ?
PUI:sarvamangalakArinI:sarvamaNgalakAriRI:p: compound sarvamaNgala-kAriRI  (note kArini changed to ..RI)..
PUI:sopasangas:sopasaNga:p: Also dropped final 's', which seems to be Anglicized plural (janapada)
funderburkjim commented 8 years ago

A similar study was made of headwords with 'nk' in the spelling. Usually, one expects 'Nk'.

Here are the cases determined to legitimately be spelled 'nk':

ACC:anantayajvankavIyasAtABawwa:anantayajvaNkavIyasAtABawwa:n: multiple words = ananta yajvan kavIyasAtABawwa
PD:ankAra:aNkAra:n: A grammatical term. aNkAra Confirmed by pd,mw,mw72
PD:ankArAdi:aNkArAdi:n:  A grammatical term.  aNkArAdi Confirmed by pd
PD:ankArAnta:aNkArAnta:n: A grammatical term.
VEI:aBipratArinkAkzaseni:aBipratAriNkAkzaseni:n: multiple words = Abhi-pratārin Kākṣa-seni 
ACC:uttarIyakarmankARvIya:uttarIyakarmaNkARvIya:n: multiple words =  uttarIyakarman kARvIya
IEG:UrpaddinkAqi:UrpaddiNkAqi:n: probably not a Sanskrit word.
IEG:tUnk:tUNk:n: probably not a Sanskrit word. (Jain)
IEG:paqanka:paqaNka:n: A Tamil word
VEI:pAkasTAmankOrayARa:pAkasTAmaNkOrayARa:n: multiple words = Pāka-sthāman Kaurayāṇa 
ACC:bAlaSAstrinkAgalakara:bAlaSAstriNkAgalakara:n: multiple words = bAlaSAstrin kAgalakara
ACC:BogakarmankASmIra:BogakarmaNkASmIra:n: multiple words = Bogakarman kASmIra
IEG:manEmeyppAnkollumirE:manEmeyppANkollumirE:n: a Tamil word ?
ACC:yaSasvinkavi:yaSasviNkavi:n: multiple words = yaSasvin kavi 
VEI:vicArinkAbanDi:vicAriNkAbanDi:n: multiple words = Vi-cārin Kābandhi
ACC:virUpAkzaSarmankavikaRWABaraRaAcArya:virUpAkzaSarmaNkavikaRWABaraRaAcArya:n:multiple words = virUpAkza Sarman kavikaRWABaraRa AcArya
IEG:vIrapaYcAlankASu:vIrapaYcAlaNkASu:n: A Tamil word
VEI:sutvankEriSiBArgAyaRa:sutvaNkEriSiBArgAyaRa:n: multiple words =  Sutvan Kairiśi Bhārgāyaṇa

Here are the ones believed to be mis-spelled:

PE:cenkaRRarAja:ceNkaRRarAja:p: Spelled as 'nK' under jambukeSvara. May be a non-Sanskrit word
PUI:lankA:laNkA:p: laNkA Confirmed by ap,ben,bur,cae,ccs,inm,mci,md,mw,pe,shs,skd,stc,vcp,wil,yat,ap90
PUI:lankAkzi:laNkAkzi:p:  no confirmation. possible compound of laNka+akzi
PUI:vIryavAnkftamjaya:vIryavANkftaYjaya:p: possibly two words (vIryavAn kftaYjaya). Note also 'mjaya' changed to Yjaya.
PUI:SAlankAyana:SAlaNkAyana:p: SAlaNkAyana Confirmed by cae,ccs,mw,mw72,pw,pwg,shs,stc,vcp,vei,wil,yat,skd
IEG:SASukAniwanka:SASukAniwaNka:p:  waNka is Sanskrit word (hatchet).  See SASukAni  in IEG for waNka spelling.
MW:sarvaSankA:sarvaSaNkA:t: confirmed typo
PUI:sAlankAyana:sAlaNkAyana:p: sAlaNkAyana Confirmed by pe,pw
funderburkjim commented 8 years ago

To continue the theme, checked 'nG' and found none.

Also checked 'nK' and found two, both digitization errors:

STC:InKana:INKana:t: Typo confirmed. INKana Confirmed by cae,ccs,md,mw,mw72,pw,pwg
CCS:paYcanKa:paYcaNaKa:t: Typo confirmed
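
The four checks ('ng', 'nk', 'nG', 'nK') amount to looking for a dental n directly before a guttural stop. A minimal sketch of a combined scan follows, assuming SLP1 headwords in a plain list; the helper names are hypothetical. Its suggestions are candidates only, since the lists above show many legitimate 'ng' and 'nk' spellings (multi-word names, grammatical terms, non-Sanskrit words).

```python
import re
from collections import defaultdict

# Dental 'n' immediately followed by a guttural stop (SLP1: k, K, g, G).
N_BEFORE_GUTTURAL = re.compile(r'n([kKgG])')

def group_by_cluster(headwords):
    """Map each cluster found ('nk', 'nK', 'ng', 'nG') to its headwords."""
    groups = defaultdict(list)
    for hw in headwords:
        clusters = {m.group(0) for m in N_BEFORE_GUTTURAL.finditer(hw)}
        for cluster in sorted(clusters):
            groups[cluster].append(hw)
    return dict(groups)

def suggest(hw):
    """Propose the spelling with guttural 'N' in place of dental 'n'."""
    return N_BEFORE_GUTTURAL.sub(r'N\1', hw)

words = ['InKana', 'lankA', 'maNgala']
print(group_by_cluster(words))
print([suggest(w) for w in words])
```
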
funderburkjim commented 8 years ago

Will be away for several days.

funderburkjim commented 8 years ago

There is a sandhi rule which Antoine states this way, and calls (in SLP1-lingo) 'n-R' sandhi:

When in the same word n is preceded by f,F,r, or z and followed by a vowel or n,m,y, or v, it is changed to R. The rule applies even when the n is separated from the preceding f,F,r, or z by several letters, provided those intervening letters be vowels, gutturals, labials, or y,v,h, or M.

I think this is a statement of Panini sutra 8.4.11.

This situation often arises in the formation of the feminine of adjectives ending in in. For instance, the feminine of karmin is karmiRI. The procedure is to add an I to karmin, yielding kar_mi_nI which then becomes karmiRI, since the intervening letters mi are labial-vowel.
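
The rule as quoted can be written out mechanically. The sketch below is an illustration only, assuming SLP1 input and treating the whole string as one word, so it ignores the "in the same word" condition entirely. The letter sets follow the quoted statement, and the function name `apply_n_to_R` is hypothetical.

```python
VOWELS = set('aAiIuUfFxXeEoO')
TRIGGERS = set('fFrz')                  # sounds that cause n -> R
# Letters that may stand between the trigger and the n:
# vowels, gutturals, labials, and y, v, h, M.
TRANSPARENT = VOWELS | set('kKgGN') | set('pPbBm') | set('yvhM')
FOLLOWERS = VOWELS | set('nmyv')        # what must follow the n

def apply_n_to_R(word):
    """Apply the quoted n-R rule to an SLP1 string taken as a single word."""
    out = []
    active = False                      # True while a trigger is in force
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ''
        if ch == 'n' and active and nxt in FOLLOWERS:
            ch = 'R'
        out.append(ch)
        if ch in TRIGGERS:
            active = True
        elif ch not in TRANSPARENT:
            active = False              # any other letter blocks the rule
    return ''.join(out)

print(apply_n_to_R('karminI'))
print(apply_n_to_R('kAminI'))
```

Because the sketch knows nothing about compound boundaries, it also rewrites a form such as atrikAminI, which is exactly where judgment is needed.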

In a Python program, the following regex can find potential errors in headwords ending in 'nI':

r'([fFrz][aAiIuUfFxXeEoOkKgGNpPbBmyvhM]*)nI$'

However, these are not always errors, due to the wiggle-room provided by the 'in the same word' condition. Application of this condition appears to require judgment; one area of judgment involves cases where there is a prefix (like pra, pari) preceding the word ending in 'nI'.
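
A minimal sketch of such a scan is given here, using the pattern quoted in this thread. The function name `flag_nI_candidates` and the sample list are assumptions for illustration; the output is a list of candidates for review, not a list of errors.

```python
import re

# A trigger (f, F, r, z), then only "transparent" letters, then final 'nI'.
NI_CANDIDATE = re.compile(r'([fFrz][aAiIuUfFxXeEoOkKgGNpPbBmyvhM]*)nI$')

def flag_nI_candidates(headwords):
    """Return the headwords ending in 'nI' where the n-R rule might apply."""
    return [hw for hw in headwords if NI_CANDIDATE.search(hw)]

sample = ['jArinI', 'karmiRI', 'kAminI', 'atrikAminI', 'garBopaGAtinI']
print(flag_nI_candidates(sample))
```

Here atrikAminI is flagged even though its r belongs to a different member of the compound, which illustrates why each hit still has to be judged individually.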

The following sections show results of an analysis of headwords from the various dictionaries made on the basis of this rule.

funderburkjim commented 8 years ago

These spelling changes not only seem correct by the sandhi rule, but the change of 'nI' to 'RI' is also confirmed by 2 or more dictionaries.

CCS:jArinI:jAriRI:t: jAriRI Confirmed by ap,ap90,cae,gra,md,mw,mw72,pw,pwg
BUR:jfmBinI:jfmBiRI:p: jfmBiRI Confirmed by mw,skd,vcp
INM:dAkzAyanI:dAkzAyaRI:p: dAkzAyaRI Confirmed by ap,ap90,mci,mw,skd,vcp
MW:DAtutaraMginI:DAtutaraMgiRI:p: DAtutaraNgiRI Confirmed by pw,acc
PUI:DArinI:DAriRI:p: DAriRI Confirmed by ap,ap90,mw,pe,shs,skd,wil,yat
ACC:BaktitaraNginI:BaktitaraNgiRI:p: BaktitaraNgiRI Confirmed by mw,pw,pwg
PUI:yakzinI:yakziRI:p: yakziRI Confirmed by ap,ap90,ben,inm,mw,pe,shs,skd,stc,wil,yat
PE:lohitAranI:lohitAraRI:p: lohitAraRI Confirmed by inm,mw,pw
ACC:SAktAnandataraNginI:SAktAnandataraNgiRI:p: SAktAnandataraNgiRI Confirmed by mw,pwg,acc
PUI:sarpinI:sarpiRI:p: sarpiRI Confirmed by ap,ap90,mw,skd,vcp 
PUI:sarvamangalakArinI:sarvamangalakAriRI:p: kAriRI confirmed by AP,AP90,BOP,MW,SCH,SKD,VCP,YAT
PUI:sarvaviGnanivArinI:sarvaviGnanivAriRI:p: 'vAriRI' confirmed under kARqa-vAriRI in MW,PW,PWG,SKD,VCP
PUI:sarvasaMkzoBinI:sarvasaMkzoBiRI:p: kzoBiRI confirmed by MW, PW

These changes of nI to RI are confirmed only in one other dictionary. Most of the incorrect 'nI' forms are in PUI.

PUI:kfttikAcArinI:kfttikAcAriRI:p: MW confirms cAriRI
VCP:garBopaGAtrinI:garBopaGAtinI:p:  Erroneous 'r'. Confirm MW garBopaGAtinI
PUI:citrarUpinI:citrarUpiRI:p: rupiRI is correct, confirmed by MW
MW:DarminI:DarmiRI:t: DarmiRI Confirmed by skd
BUR:niriNganI:niriNginI:t: inI confirmed by MW. Why not 'iRI' ?
MW:puRqarIkinI:puRqarIkiRI:p: puRqarIkiRI Confirmed by sch.  Which is right?
IEG:BojanaAkzayanI:BojanAkzayiRI:p: ? Bojana+akzayiRI.  Note change 'ya' to 'yi'
PUI:yantrinI:yantriRI:t: yantriRI Confirmed by mw
PUI:rakzAvaDArinI:rakzAvaDAriRI:p: DAriRI confirmed by MW.
PUI:vidrAvinI:vidrAviRI:p: vidrAviRI Confirmed by MW under vidrAvin
MW:visarpinI:visarpiRI:p: Print does not specify under visarpin. However, sarpiRI is confirmed under sarpin.
PUI:vedarUpinI:vedarUpiRI:p:  rUpiRI confirmed in MW under rUpin
PUI:SrIcakrarUpinI:SrIcakrarUpiRI:p: rUpiRI confirmed in MW under rUpin
PUI:sarvarakzAsvarUpinI:sarvarakzAsvarUpiRI:p:  rUpiRI confirmed in MW under rUpin
PUI:sarvavidrAvinI:sarvavidrAviRI:p: vidrAviRI Confirmed by MW under vidrAvin
PUI:svarUpinI:svarUpiRI:p:  rUpiRI confirmed in MW under rUpin
PUI:hfdAkarzaRarUpinI:hfdAkarzaRarUpiRI:p:  rUpiRI confirmed in MW under rUpin
funderburkjim commented 8 years ago

In these cases, the 'nI' ending is judged correct since it occurs in a different compound pada (subword) than the preceding z,r,f,F

PD:atyantavizaGnI:atyantavizaGRI:n: viza-GnI  (different syllables)  
PD:atrikAminI:atrikAmiRI:n: atri-kAminI  (different syllables)
PD:atrihAyanI:atrihAyaRI:n:  a-tri-hAyanI  (different syllables)
PD:adrivAhinI:adrivAhiRI:n: adri-vAhinI  (different syllables)
PD:aDyayanAnantaraBAvinI:aDyayanAnantaraBAviRI:n: aDyayana-anantara-BAvinI  (different syllables)
PD:aniyatapuruzagAminI:aniyatapuruzagAmiRI:n: puruza-gAminI  (different syllables)
PD:anurUpaBartfgAminI:anurUpaBartfgAmiRI:n: Bartf-gAminI  (different syllables)
PD:anuzWAnaviSezopayoginI:anuzWAnaviSezopayogiRI:n: viSeza -upayoginI  (different syllables)
PD:anekapuruzagAminI:anekapuruzagAmiRI:n: puruza-gAminI  (different syllables)
PD:antaHpurakAminI:antaHpurakAmiRI:n: pura-kAminI  (different syllables)
PD:antaHpuragAminI:antaHpuragAmiRI:n: pura-gAminI  (different syllables)
SKD:krimiGnI:krimiGRI:n: krimi-GnI  (different syllables)
MW:tAmbUlakaraNkavAhinI:tAmbUlakaraNkavAhiRI:n: karaNka-vAhinI  (different syllables)
PE:pramohinI:pramohiRI:n: 'nI' confirmed in PWG under 'pramohin'.  (different syllables)
MW:bahizpavamAnI:bahizpavamARI:n: bahiz-pavamAnI  (different syllables)
SKD:brahmayonI:brahmayoRI:n: brahma-yonI  (different syllables)
SCH:mayUravAhinI:mayUravAhiRI:n: mayUra-vAhinI  (different syllables)
MW:rAtryahanI:rAtryahaRI:n: rAtri+ahanI  (different syllables)
MW:rAmaBaginI:rAmaBagiRI:n: rAma-BaginI  (different syllables)
BHS:vAriyoginI:vAriyogiRI:n: vAri-yoginI  (different syllables)
MCI:vfzaBaNginI:vfzaBaNgiRI:n: vfza-BaNginI (different syllables)
PWG:SakrAgnI:SakrAgRI:n: Sakra-agnI  (different syllables)
PUI:sarvayonI:sarvayoRI:n: sarva-yonI  (different syllables)
MCI:sahasraGnI:sahasraGRI:n: sahasra-GnI  (different syllables)
SCH:suparvavAhinI:suparvavAhiRI:n: suparva-vAhinI  (different syllables)
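
When a segmentation is available, part of this judgment can be sketched in code: test only the final member of the compound for a trigger. The sketch below assumes hyphen-separated segmentations like those in the notes above. The function name is hypothetical, and the segmentation 'citra-rUpinI' is my own assumption for contrast.

```python
import re

# Trigger and final 'nI' inside one member, with only transparent letters between.
SAME_PADA = re.compile(r'[fFrz][aAiIuUfFxXeEoOkKgGNpPbBmyvhM]*nI$')

def trigger_in_final_pada(segmented):
    """True if the last hyphen-separated member itself contains f, F, r or z
    in a position that would change its final 'nI' to 'RI'."""
    final = segmented.split('-')[-1]
    return bool(SAME_PADA.search(final))

for form in ['viza-GnI', 'puruza-gAminI', 'citra-rUpinI']:
    print(form, trigger_in_final_pada(form))
```

This covers only the simple case where the trigger and the n sit in different members; prefixes such as pra and pari, and gaRa-based exceptions, still require a human decision.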

These are questionable, also marked as no change.

PD:antaHprapAkinI:antaHprapAkiRI:n: ? Not sure of reason here  (different syllables)
PD:anDrAvanI:anDrAvaRI:n: ? anDrA-vanI  (different syllables)
IEG:bArahgAnI:bArahgARI:n: A Sanskrit word?  (different syllables)
IEG:sazGAnI:sazGARI:n: ? Is this Sanskrit?
PUI:parikampinI:parikampiRI:n:  ? Is prefix 'pari' considered a different 'word'
PD:antarnI:antarRI:n: antarRI This is a verb - PD gives both forms. Why ?
PW:karaNkinI:karaNkiRI:n: karaNkiRI Confirmed by mw  ? Which is right?

These are marked as no change for various reasons.

IEG:akzayanI:akzayaRI:n: 'nI' intentionally presented as alternate form
STC:nirnI:nirRI:n: In stc, this is a referential headword for nirRI. nirRI Confirmed by ap,ap90,mw,mw72,stc
VCP:bazkayinI:bazkayiRI:n: VCP shows both forms (nI and RI) as alternates. bazkayiRI Confirmed by mw,mw72,pwg,skd
MW:srOGnI:srOGRI:n: related to proper name
MW:hariBAvinI:hariBAviRI:n: MW gives both spellings. skd confirms hariBAviRI

This one deserves further study. The 'nI' form seems reasonable, as the 'r' in 'garga' occurs in a different pada of the compound. However, there is apparently some reference in Patanjali's commentary that also justifies the 'RI' form, but I have not followed that lead (not sure how to find this Patanjali commentary reference).

MW:gargaBaginI:gargaBagiRI:n: MW has both forms. the 'RI' form references Pan 8.4.11 Pat.
funderburkjim commented 8 years ago

Will consider this issue finished, and begin installation of corrections.

gasyoun commented 8 years ago

For Patanjali we need Dhaval.

funderburkjim commented 8 years ago

These corrections are now installed. 147 changes in about 23 dictionaries.

gasyoun commented 8 years ago

The r'([fFrz][aAiIuUfFxXeEoOkKgGNpPbBmyvhM]*)nI$' regex is something I was speaking with Dhaval long ago (as a sandhi tool applied showing up un-sandhi form). I only spoke, you implemented it - not only implemented, but removed the false positives. It's amazing, I'm your fan. Not only you do the theoretical work, you do all the practical implementing, checking and even more - the stats. Even the checking and stats would be enough to improve your karma.

funderburkjim commented 8 years ago

@gasyoun Thanks for encouraging words.