sanskrit-lexicon / CORRECTIONS

Correction history for Cologne Sanskrit Lexicon

all vs PW 3-grams, part 1 #294

Closed drdhaval2785 closed 8 years ago

drdhaval2785 commented 8 years ago

Examination file - http://sanskrit-lexicon.github.io/CORRECTIONS/ngram/output/html/allvsPW_3.html
Total 247 entries to examine.

Corrected file - https://github.com/sanskrit-lexicon/CORRECTIONS/blob/master/ngram/output/corrections/allvsPW_3_corrected.txt

#294 and #299 are both corrected and uploaded in the above file. Ready for installation.
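
For readers new to the n-gram studies, here is a minimal sketch of the kind of check the issue title suggests. It assumes that a headword is flagged when it contains a 3-gram that never occurs in any headword of the base dictionary (PW here). That reading is inferred from the title and from the trigram in the last field of each entry; the function names `ngrams` and `flag_unseen` are hypothetical and not taken from the repository.

```python
def ngrams(word, n=3):
    """Return the set of all n-character substrings of an SLP1 headword."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def flag_unseen(base_words, test_words, n=3):
    """List (headword, unseen n-grams) for test headwords that contain
    at least one n-gram absent from every base headword.

    Assumption: this mirrors the 'all vs PW' comparison; the real
    scripts may differ in detail.
    """
    known = set()
    for word in base_words:
        known |= ngrams(word, n)
    flagged = []
    for word in test_words:
        unseen = sorted(ngrams(word, n) - known)
        if unseen:
            flagged.append((word, unseen))
    return flagged
```

With a toy base list containing only `kzudracUqa`, the misspelling `kzudracUda` is flagged because `cUd` and `Uda` are unseen, which matches the `cUd` reported for that entry in this thread.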

Not found

ccs:fgiman,3839:fgiman:n:fgi
ccs:fgimaya,3840:fgimaya:n:fgi
ccs:fgivaDAna,3841:fgivaDAna:n:fgi
ccs:ekOka,4016:ekOka:n:ekO
ccs:krIqAmaheDra,5578:krIqAmaheDra:n:eDr
ccs:kzudraGARiwakA,5745:kzudraGARiwakA:n:Riw
mw:gUDatayA,66346:gUDatayA:n:gUD
ccs:GaRiwan,6586:GaRiwan:n:Riw
ccs:GaRIwaka,6585:GaRIwaka:n:RIw
ccs:jayoDvaja,7466:jayoDvaja:n:oDv
ccs:jYipsati,7897:jYipsati:n:Yip

drdhaval2785 commented 8 years ago

1-10

yat:akftEnas,44428:akftEnas:n:ftE
sch:ajajYivaMs,787:ajajYivaMs:n:Yiv
acc:aRuSabdopanizad,251:aRuSabdopanizad:n:bdo
pui:aTiTi,285:aTiTi:p:aTiTi is alphabetically misordered between two atiTi.
mw:aDyarvuda,4607.1:aDyarvuda:n:vud
bur:anugfhRAmi,608:anugfhRAmi:n:hRA
acc:anumitigranTawIkA,699:anumitigranTawIkA:n:Taw
vcp:apahnUyamAna,2820:apahnUyamAna:n:hnU
mw:apodUh,10407:apodUh:n:odU
bur:aByApnomi,1255:aByApnomi:n:pno

drdhaval2785 commented 8 years ago

11-20

sch:aByupEtos,4310:aByupEtos:n:upE
pui:aMBuDArA,729:aMbuDArA:p:BuD can't come before aMBa. Alphabetic misordering.
mw:avaDes,17454:avaDes:n:Des
mw:avitaTehita,18781:avitaTehita:n:Teh
sch:avEBIdika,5725:avEBIdika:n:BId
inm:aSvisutO,130:aSvisutO:n:utO
mw72:asammoza,7496:asammoza:n:mmo
sch:AkEwaBavEri,6587:AkEwaBavEri:n:AkE
acc:AcArapradIpe,31655:AcArapradIpe:n:Ipe
shs:AQyamBaviRu,5205:AQyamBavizRu:t:iRu

drdhaval2785 commented 8 years ago

21-30

skd:ASvineyO,3903:ASvineyO:n:eyO
mw:AsaPaKAna,27993.1:AsaPaKAna:n:PaK
mw:izwATititva,29578.2:izwAtiTitva:p:wAT
acc:ucCizwasumuKIdevInityArcanaviDi,39494:ucCizwasumuKIdevInityArcanaviDi:n:KId
sch:udgUrayitar,8331:udgUrayitar:n:dgU
bhs:udgfhRAti,3540:udgfhRAti:n:hRA
shs:udvoDana,7142:udboDana:t:dvo
yat:udvorakAma,44906:udvorakAma:n:dvo
bur:upagfhRAmi,3089:upagfhRAmi:n:hRA
mw:upaDOkita,34614:upaQOkita:t:DOk

drdhaval2785 commented 8 years ago

31-40

acc:uBayatomuKIgavIdAnaprayoga,2785:uBayatomuKIgavIdAnaprayoga:n:KIg
acc:uBayatomuKIdAna,2786:uBayatomuKIdAna:n:KId
acc:uBayatomuKIpratigrahaprAyaScitta,2787:uBayatomuKIpratigrahaprAyaScitta:n:KIp
mw:uhlaRa,37936.1:uhlaRa:n:uhl
yat:UjjitacaRuSAsana,44961:UjjitacaRuSAsana:n:Ujj
acc:fkzoccaya,2858:fkzoccaya:n:zoc
mw:ekadehO,39335:ekadehO:n:ehO
acc:ekAdaSyutpattikaTAnaka,31927:ekAdaSyutpattikaTAnaka:n:Syu
acc:ekAdaSyutpattivratodyApanaviDi,31928:ekAdaSyutpattivratodyApanaviDi:n:Syu
acc:ekAdaSyudyApanapadDati,3010:ekAdaSyudyApanapadDati:n:Syu

drdhaval2785 commented 8 years ago

41-50

vcp:kawunizpAba,11450:kawunizpAba:n:pAb
bhs:kaRWeguRa,4369:kaRWeguRa:n:Weg
bhs:kapIdaka,4449:kapIdaka:n:pId
bhs:kamIbala,4467:kamIbala:n:mIb
vcp:karbagatO,12364:karba:t:gatO is meaning of the verb karba.
mw:kalpakeqAra,46092:kalpakeqAra:p:keqAra is not possible. Seems a print smudge.
acc:kalpAnupadapAQA,30357:kalpAnupadapAQA:n:pAQ
acc:kAmadevavawIsArasaMgraha,3760:kAmadevavawIsArasaMgraha:n:wIs
yat:kArabellaka,9510:kAravellaka:p:'b' occurs after 'v'. Alphabetic misordering.
acc:kASIdAsaprahasana,4158:kASIdAsaprahasana:n:SId

gasyoun commented 8 years ago

So many :n: - maybe time to switch to alphabetical ordering? Or you want to finalize one method, @drdhaval2785 ?

drdhaval2785 commented 8 years ago

For trigrams, I have changed methodology. Earlier I used to examine each word's scan. Now I visually inspect the word and open only fishy words, not existing in my Sanskrit knowledge. So will not take much time.

drdhaval2785 commented 8 years ago

51-60

acc:kASyupADyAya,4237:kASyupADyAya:n:Syu
vcp:kuWAwaNkf,13920:kuWAwaNga:t:Meaning has waNga clearly.
acc:kuberakAYjibilvIya,42667:kuberakAYjibilvIya:n:lvI,jib
pui:kumuTi,3191:kumuTi:n:uTi
pui:kurujib,3264:kurujib:n:jib
acc:kUwodDAra,4494:kUwodDAra:n:Uwo
bur:kfRozi,4992:kfRozi:n:Roz
acc:kevalavyatirekigranTarahasya,4932:kevalavyatirekigranTarahasya:n:kig
acc:koneriBawwa,32369:koneriBawwa:n:one
acc:kozWakacintAmaRiwIkA,5123:kozWakacintAmaRiwIkA:n:Riw
acc:kozWISlokapraSnaprakAra,5127:kozWISlokapraSnaprakAra:n:ISl

drdhaval2785 commented 8 years ago

61-70

acc:kozWISlokapraSnaprakAra,5127:kozWISlokapraSnaprakAra:n:ISl
acc:kOkilIsOtrAmaRoprayoga,42853:kOkilIsOtrAmaRIprayoga:t:IsO
pwg:kOkkuwIbarha,119617:kOkkuwIbarha:n:wIb
bhs:kOSIdya,5285:kOSIdya:n:SId
mw:krIqAbiqAlikA,58142.1:krIqAbiqAlikA:n:qAb
mw:kzudracUda,59930:kzudracUqa:t:cUd
shs:KaNgAGAta,12637:KaqgAGAta:t:gAG
bhs:Kuddalaka,5520:Kuddalaka:n:Kud
acc:gaRacaturTIcandradarSanakaTA,5568:gaRacaturTIcandradarSanakaTA:n:TIc
krm:gAstutO,392:gA:y:stutO is explanation.

gasyoun commented 8 years ago

> I visually inspect the word and open only fishy words, not existing in my Sanskrit knowledge. So will not take much time.

Right, it's good you do not have to waste time for all of them. I would have to check each one myself.

funderburkjim commented 8 years ago

@drdhaval2785 In closing the other n-gram studies, I noticed this one has not been installed. Is it ready for installation?

drdhaval2785 commented 8 years ago

Not yet ready.

gasyoun commented 8 years ago

Come on Dhaval, give him a chance to install it :ideograph_advantage:

drdhaval2785 commented 8 years ago

I leave the task to superhero Marcis.

Shalu411 commented 8 years ago

Namaste. (Smile..) After long long break- Am back. Hope the wonderful team forgives my absence all this while. Can I have the file in Devanagari? A correction in advance- as an "enter cheers"-- व्यालायुघ >> व्यालायुध ((घ >> ध))

funderburkjim commented 8 years ago

@Shalu411 HI! Nice to 'see' you.

Regarding व्यालायुघ , if this is a PW issue, then I think it has already been solved - Do you agree?

@drdhaval2785 prepared the working papers for this issue, so he would be the best to prepare a Devanagari version for you. Please indicate to Dhaval specifically what you want to be in Devanagari.

Note: While you've been away, we've developed a system for error preparation, which we call a 'standard form'. As two examples from the entries by Dhaval from above:

acc:kozWISlokapraSnaprakAra,5127:kozWISlokapraSnaprakAra:n:ISl
acc:kOkilIsOtrAmaRoprayoga,42853:kOkilIsOtrAmaRIprayoga:t:IsO

Each entry has 5 fields, separated by the colon ':' character. Here is a brief description of these fields:

Note that the spellings in 2 and 3 are in SLP1. The reason for using SLP1 has to do with the 'installation' of corrections. When someone submits a batch of 'standard form corrections', it is then up to me to install those corrections. I do this with the assistance of various computer programs. And those programs assume that the headwords are spelled in SLP1. SORRY ABOUT THAT ! Hope you can adapt to SLP1 for the standard form.
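
As an illustration of the 5-field layout, here is a minimal parsing sketch. The field names are assumptions read off the examples in this thread: field 1 looks like a dictionary code, field 2 like the existing headword plus a record number after the comma, field 3 like the proposed headword, field 4 like a one-letter status code, and field 5 like a free-text note or the n-gram in question. `Correction` and `parse_standard_form` are hypothetical names, not part of the installation programs.

```python
from typing import NamedTuple


class Correction(NamedTuple):
    # Field names are guesses based on the sample entries in this thread.
    dict_code: str
    old_headword: str   # SLP1
    lnum: str           # kept as text: values like 4607.1 and 81502.01 occur
    new_headword: str   # SLP1
    status: str         # one-letter code, e.g. n, t, p, y
    comment: str


def parse_standard_form(line):
    """Split one standard-form line into its five colon-separated fields."""
    # maxsplit=4 keeps any later colons inside the free-text comment.
    parts = line.strip().split(":", 4)
    if len(parts) != 5:
        raise ValueError("expected 5 colon-separated fields: %r" % line)
    dict_code, key, new_headword, status, comment = parts
    old_headword, sep, lnum = key.rpartition(",")
    if not sep:
        raise ValueError("field 2 should look like 'headword,number': %r" % key)
    return Correction(dict_code, old_headword, lnum,
                      new_headword, status, comment.strip())
```

For example, `parse_standard_form("acc:kOkilIsOtrAmaRoprayoga,42853:kOkilIsOtrAmaRIprayoga:t:IsO")` separates the old spelling `kOkilIsOtrAmaRoprayoga` from the proposed `kOkilIsOtrAmaRIprayoga`.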

drdhaval2785 commented 8 years ago

https://github.com/sanskrit-lexicon/CORRECTIONS/commit/60bf174704af80bee29b86255dc3783121ae12b0 This commit has created Devanagari files for all the data in the ngram repositories as per @Shalu411's request.

drdhaval2785 commented 8 years ago

71-80

pe:gilgamIz,2437:gilgamIz:n:ilg
bhs:guluguluyati,5804:guluguluyati:n:luy
inm:gUDavrata,4273:gUQavrata:t:gUD
mw:gfhRAna,66760:gfhRAna:n:hRA
pui:gratadvoca,4434:gratadvoca:n:dvo
vcp:grahavarzAdiPala,18498:grahavarzAdiPala:n:diP
acc:caturdaSyudyApana,32845:caturdaSyudyApana:n:Syu
vcp:cicCiwiNga,19725:cicCiwiNga:n:Ciw
acc:cidGanAnandanATa,32950:cidGanAnandanATa:n:idG
pe:cUdAkarRa,1635:cUqAkarRa:p:cUd

drdhaval2785 commented 8 years ago

81-90

pe:cUdAma,1637:cUdAma:n:cUd
sch:cellamabollana,13448:cellamabollana:n:bol
mw72:cozkUyamARa,52970:cozkUyamARa:n:ozk
bur:cozkUye,6821:cozkUye:n:ozk
stc:jajYivAMs,9450:jajYivAMs:n:Yiv
mw:jambUdvIpeSvara,77270.1:jambUdvIpeSvara:n:Ipe
acc:jalaBedaBAvArTaboDinIwIkA,43515:jalaBedaBAvArTaboDinIwIkA:n:nIw
acc:jAtakapAwIsaMgraha,7994:jAtakapAwIsaMgraha:n:wIs
mw:jATarya,78529:jAWarya:t:jAT
yat:jibAjiva,15805:jibAjiva:n:jib

drdhaval2785 commented 8 years ago

91-100

acc:jIvabrahmEkya,40012:jIvabrahmEkya:n:hmE
gra:jesa,3547:jesa:n:jes
acc:jvAlAmuKistotra,8421:jvAlAmuKIstotra:t:Kis
stc:Jatkfta,9744:Jatkfta:n:Jat
mw:QilI,81502.01:QilI:n:Qil
mw:QillikA,81502.2:QillikA:n:Qil
mw:QillI,81502.1:QillI:n:Qil
skd:RATa,14172:RATa:n:RAT
acc:tattvArTacintAmaRiwIkA,8672:tattvArTacintAmaRiwIkA:n:Riw
acc:tiTyuktiratnAvalI,9053:tiTyuktiratnAvalI:n:Tyu

drdhaval2785 commented 8 years ago

This completes correction submission in this thread.

funderburkjim commented 8 years ago

Added

krm:KawwasaMvaraRe,334:Kawwa:t: saMvaraRe is explanation

similar to the gAstutO instance mentioned above.

drdhaval2785 commented 8 years ago

Judging from the two KRM entries (gAstutO etc.), a quick check would be to examine all KRM headwords longer than 6 characters.

Most of them would be erroneous, with a few exceptions such as daridrA, because most Sanskrit roots (with anubandha) are at most 6 letters long in SLP1.
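
The length check can be sketched in a few lines. This assumes the KRM headwords are already available as a list of SLP1 strings; `long_headwords` is a hypothetical helper, and the result is only a list of candidates for manual review, not a list of confirmed errors.

```python
def long_headwords(headwords, max_len=6):
    """Return headwords longer than max_len SLP1 characters.

    In SLP1 each sound is a single character, so len() counts letters.
    """
    return [hw for hw in headwords if len(hw) > max_len]


# Headwords taken from this thread.
candidates = long_headwords(["gA", "gAstutO", "Kawwa", "KawwasaMvaraRe", "daridrA"])
print(candidates)
```

The genuine root daridrA is also returned, so each candidate still needs a look at the scan.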

funderburkjim commented 8 years ago

Here's the other word in VCP whose spelling ends in gatO (via advanced search).

vcp:CapagatO,20270:Capa:t:gatO is meaning of the verb Capa

gasyoun commented 8 years ago

> SORRY ABOUT THAT ! Hope you can adapt to SLP1 for the standard form.

@funderburkjim she can't. That's why she wants it in Devanagari. Make one more tool that converts it back to SLP1, otherwise we can't hope for @Shalu411's help.

drdhaval2785 commented 8 years ago

In that case, add idempotent scripts that convert SLP1 to Devanagari when creating the standard form, and Devanagari back to SLP1 before the standard forms are processed. Should be easily doable for Jim with transcoder.py.
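
To make the SLP1 to Devanagari direction concrete, here is a self-contained sketch. It deliberately does not call transcoder.py, whose interface is not shown in this thread; the function `slp1_to_deva` and its tables are illustrative assumptions covering only the basic SLP1 letters, anusvara and visarga (no accents, avagraha, or rarer signs).

```python
VIRAMA = "\u094d"
CONSONANTS = dict(zip("kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh",
                      "कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह"))
VOWELS = dict(zip("aAiIuUfFxXeEoO", "अआइईउऊऋॠऌॡएऐओऔ"))
# Dependent vowel signs (matras); 'a' has none because it is inherent.
MATRAS = dict(zip("AiIuUfFxXeEoO",
                  "\u093e\u093f\u0940\u0941\u0942\u0943\u0944"
                  "\u0962\u0963\u0947\u0948\u094b\u094c"))
SIGNS = {"M": "\u0902", "H": "\u0903"}  # anusvara, visarga


def slp1_to_deva(text):
    """Convert an SLP1 string to Devanagari (basic letters only)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == "a":
                i += 1                      # inherent vowel: nothing to write
            elif nxt in MATRAS:
                out.append(MATRAS[nxt])     # vowel sign attached to consonant
                i += 1
            else:
                out.append(VIRAMA)          # no vowel follows: add virama
        elif ch in VOWELS:
            out.append(VOWELS[ch])          # independent vowel
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        else:
            out.append(ch)                  # digits, commas, etc. unchanged
        i += 1
    return "".join(out)
```

Under these assumptions `slp1_to_deva("vyAlAyuDa")` gives व्यालायुध, the corrected form mentioned earlier in the thread.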

funderburkjim commented 8 years ago

Regarding standard form corrections and Devanagari.

If @Shalu411 (or someone else) submits corrections in standard form EXCEPT that the Sanskrit words are in Devanagari unicode then I'll write an adapter program that handles the situation.
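
A minimal sketch of what such an adapter could look like, assuming the Devanagari appears only in fields 2 and 3 and uses basic letters, vowel signs, virama, anusvara and visarga. The names `deva_to_slp1` and `adapt_line` are hypothetical; this is not the adapter program itself.

```python
VIRAMA = "\u094d"
CONSONANTS = dict(zip("कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह",
                      "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"))
VOWELS = dict(zip("अआइईउऊऋॠऌॡएऐओऔ", "aAiIuUfFxXeEoO"))
MATRAS = dict(zip("\u093e\u093f\u0940\u0941\u0942\u0943\u0944"
                  "\u0962\u0963\u0947\u0948\u094b\u094c",
                  "AiIuUfFxXeEoO"))
SIGNS = {"\u0902": "M", "\u0903": "H"}  # anusvara, visarga


def deva_to_slp1(text):
    """Convert Devanagari to SLP1; non-Devanagari characters pass through."""
    out = []
    pending_a = False   # True right after a consonant's inherent 'a'
    for ch in text:
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            out.append("a")             # assume inherent vowel for now
            pending_a = True
            continue
        if ch in MATRAS or ch == VIRAMA:
            if pending_a:
                out.pop()               # drop the assumed inherent 'a'
            if ch in MATRAS:
                out.append(MATRAS[ch])
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        else:
            out.append(ch)
        pending_a = False
    return "".join(out)


def adapt_line(line):
    """Rewrite fields 2 and 3 of a standard-form line into SLP1."""
    parts = line.split(":", 4)
    if len(parts) != 5:
        raise ValueError("expected 5 colon-separated fields: %r" % line)
    parts[1] = deva_to_slp1(parts[1])   # 'headword,number': digits untouched
    parts[2] = deva_to_slp1(parts[2])
    return ":".join(parts)
```

Because plain ASCII passes through unchanged, running `adapt_line` on a line that is already in SLP1 returns it as is, so the step can be applied to every submission regardless of script.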

gasyoun commented 8 years ago

> Sanskrit words are in Devanagari unicode

She is waiting for @drdhaval2785's tutoring to start.

drdhaval2785 commented 8 years ago

Tutorial given. She may start soon. Will hand-hold whenever needed.