sanskrit-lexicon / CORRECTIONS

Correction history for Cologne Sanskrit Lexicon

SCH corrections from faultfinder list #51

Closed funderburkjim closed 9 years ago

funderburkjim commented 9 years ago

This issue is devoted to corrections of headwords from the SCH dictionary.

The starting list contains 180 headwords, and was obtained from AllvsMW.txt in SanskritSpellCheck as all lines matching the regular expression:

:SCH$

There are no duplicates, and no r-x-x words. After listing headword corrections, a list of the remaining headwords (those not requiring correction) from this list will be shown.
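The selection step can be sketched roughly as follows. This assumes each line of AllvsMW.txt has the form `headword:DICT1,DICT2,...` (as in lines like `yAcYA:AP,AP90,...` quoted in this thread); the sample lines and the function name are made up for illustration.

```python
import re

# Matches lines whose dictionary list is exactly "SCH".
SCH_ONLY = re.compile(r":SCH$")

def select_sch_only(lines):
    """Return the lines that end in ':SCH' (headword found only in SCH)."""
    stripped = (ln.rstrip("\n") for ln in lines)
    return [ln for ln in stripped if SCH_ONLY.search(ln)]

# Illustrative sample; the last line is a made-up multi-dictionary entry.
sample = [
    "yAcYA:AP,AP90,MW,PW",
    "atiyAcnA:SCH",
    "bahirvESravaRa:MW",
    "haRTA:SCH",
    "example:PW,SCH",
]
sch_only = select_sch_only(sample)
print(sch_only)
```

Because the pattern requires a colon directly before `SCH`, a headword that also occurs in another dictionary (`example:PW,SCH`) is not selected.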

drdhaval2785 commented 9 years ago

Let me help you here @funderburkjim .

drdhaval2785 commented 9 years ago

1 AmUfDAntam - आमूऋधान्तम् -> AmUrDAntam - आमूर्धान्तम्

capture

drdhaval2785 commented 9 years ago

2 SAate - शाअते -> SAlate - शालते

capture

drdhaval2785 commented 9 years ago

3 guQacaraRa - गुढचरण -> gUQacaraRa - गूढचरण

capture

gasyoun commented 9 years ago

2 Maybe we should have SAl(ate) as well, what do you think? That way we could extract a list of roots later.

drdhaval2785 commented 9 years ago

Don't we have something like key2 in there ?

gasyoun commented 9 years ago

We have key2; it contains all the meta characters, like the degree sign and *, which we see in the 3rd case.

Update: Flag it as separate issue. It is deeper than correction. (Dhaval)

drdhaval2785 commented 9 years ago

4 atiyAcnA - अतियाच्ना ??? Matches the scan exactly, but the form is very odd. Possible print error. Almost all other dictionaries have yAcYA. See yAcYA:AP,AP90,BOP,CAE,CCS,MD,MW,MW72,PW,PWG,SKD,VCP,YAT

drdhaval2785 commented 9 years ago

5 aduHspfzta - अदुःस्पृष्त -> aduHspfzwa - अदुःस्पृष्ट

capture

drdhaval2785 commented 9 years ago

6 apadAYta - अपदाञ्त -> apadAnta - अपदान्त

capture

drdhaval2785 commented 9 years ago

7 apratizTAyuka - अप्रतिष्थायुक -> apratizWAyuka - अप्रतिष्ठायुक

capture

drdhaval2785 commented 9 years ago

8 aBiruqgatA - अभिरुड्गता -> aBirudgatA - अभिरुद्गता

capture

drdhaval2785 commented 9 years ago

9 amRAs - अम्णास् -> amRas - अम्णस् This is an accent mark erroneously entered as A. Will have to look for such other occurrences if any in this text later on.

capture

drdhaval2785 commented 9 years ago

10 avapfzTI - अवपृष्थी -> avapfzWI - अवपृष्ठी

capture

gasyoun commented 9 years ago

9 The accent-entered-as-A mistake might be unique to SCH, since SCH and BHS are the only two dictionaries in IAST.

drdhaval2785 commented 9 years ago

11 avfztikAma - अवृष्तिकाम -> avfzwikAma - अवृष्टिकाम

capture

drdhaval2785 commented 9 years ago

12 asamtfRRa - असम्तृण्ण Unable to locate it at http://www.sanskrit-lexicon.uni-koeln.de/scans/SCHScan/2014/web/webtc/servepdf.php?page=085 To do: search properly and paste the capture.

drdhaval2785 commented 9 years ago

13 asiyaswi - असियस्टि -> asiyazwi - असियष्टि

capture

drdhaval2785 commented 9 years ago

14 inqindirA - इन्डिन्दिरा -> indindirA - इन्दिन्दिरा

capture

gasyoun commented 9 years ago

12 asamtṛṇṇa [L=6177] [p= 085-1] Adj. nicht aneinander befestigt , Jaim. 3 , 3 , 24.

There is no such word on page 085, nor on 084 or 086. Checked with OCR as well. The bad part is that the German meaning, "Adj. nicht aneinander befestigt", does make sense. Searched by it as well, without success.

From the same source we have:

अकर्मकरण akarmakaraṇa [L=50] [p= 001-3] Adj. = 2. akaraṇa oben , Jaim. 3 , 8 , 15.
असंतर्दन asaṃtardana [L=6129] [p= 084-2] n. das Nichtaneinanderbefestigen , Komm. zu Jaim. 3 , 3 , 24.
असम्तृण्ण asamtṛṇṇa [L=6177] [p= 085-1] Adj. nicht aneinander befestigt , Jaim. 3 , 3 , 24.

drdhaval2785 commented 9 years ago

15 kARdarika - काण्दरिक -> kARqarika - काण्डरिक

capture

drdhaval2785 commented 9 years ago

12 Is there something like a supplement where we should search, as in MW?

drdhaval2785 commented 9 years ago

16 kubjottara - कुब्जोत्तर -> kubjottarA - कुब्जोत्तरा

capture

This takes us to another method of identifying errors when the word ends in 'a' and is 'feminine'.
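A rough sketch of that check, assuming we can get (headword, gender) pairs out of the digitization. The gender labels ('f.', 'n.', ...) and the function name are assumptions for illustration, not how the SCH data is actually exposed.

```python
def suspicious_feminines(entries):
    """Return SLP1 headwords marked feminine that end in short 'a'.

    entries: iterable of (key1, gender) pairs; the gender strings
    are hypothetical labels such as 'm.', 'f.', 'n.'.
    """
    return [key for key, gender in entries
            if gender == "f." and key.endswith("a")]

# Illustrative sample only.
sample = [
    ("kubjottara", "f."),    # the case found above
    ("kubjottarA", "f."),    # corrected form, not flagged
    ("asaMtardana", "n."),   # neuter, not flagged
]
flagged = suspicious_feminines(sample)
print(flagged)
```

SLP1 distinguishes short `a` from long `A` by case, so a plain `endswith("a")` is enough here. Every hit would still need to be checked against the scan.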

gasyoun commented 9 years ago

12 Yes, it's called Nachtrag (= supplement) in German; see page 395 of http://www.sanskrit-lexicon.uni-koeln.de/scans/SCHScan/2014/downloads/sch_bookmark.pdf The Jaim. source is not listed in the abbreviations either.

asam

gasyoun commented 9 years ago

14 This case is interesting as well. There is indindira [L=7637] [p= 105-1] m. Biene , S I , 121 , 3 [Lesart unsicher bis auf das Genus!] as a separate entry, and *indindira as a variant inside the indindirA entry. As we see, *indindira is out of the game now. In this case I hardly understand why we would want a hypothetical word (marked with *) when we have a real one. But I guess there are auch (= as well) cases that might introduce new words too. Possible auch cases:

Another interesting word is gedruckt (=printed as, 85 cases), so betont (=with such a strange accent, 34 cases) and lies (=to be read as, 257 cases).
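Counts like these could be reproduced with something along the following lines. It is only a sketch: it assumes the entry bodies are available as plain strings, and the second and third sample bodies are made-up placeholders.

```python
import re
from collections import Counter

# Marker words discussed above.
MARKERS = ["auch", "gedruckt", "so betont", "lies"]

def count_markers(bodies):
    """Count, per marker, how many entry bodies contain it as a whole word."""
    counts = Counter()
    for body in bodies:
        for marker in MARKERS:
            if re.search(r"\b" + re.escape(marker) + r"\b", body):
                counts[marker] += 1
    return counts

bodies = [
    "auch H 46 , 18 (Ko.).",   # body of the SAate entry
    "gedruckt X , lies Y .",   # placeholder, not real SCH text
    "lies Z .",                # placeholder, not real SCH text
]
marker_counts = count_markers(bodies)
print(marker_counts)
```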

drdhaval2785 commented 9 years ago

17 kupyADyakza - कुप्याध्यक्ष - SCH
18 kupyopajIvin - कुप्योपजीविन् - SCH

Not able to trace them here.

drdhaval2785 commented 9 years ago

19 KaRdenduSiromaRi - खण्देन्दुशिरोमणि -> KaRqenduSiromaRi - खण्डेन्दुशिरोमणि

capture

drdhaval2785 commented 9 years ago

20 GARapiNyAka - घाणपिङ्याक Not sure about the correct reading. But very improbable word.

Apart from this word, the only headword containing the 'Ny' combination is aNyanta, which is a grammatical term.
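The same idea can be generalized to any two-letter combination. Below is a minimal sketch, assuming a plain list of SLP1 headwords; the exclusion set for grammatical terms and the function name are illustrative.

```python
from collections import defaultdict

def rare_bigrams(headwords, max_words=1, exclude=()):
    """Map each character bigram to the headwords containing it,
    keeping only bigrams found in at most max_words headwords.
    Headwords listed in exclude (e.g. grammatical terms) are skipped."""
    words_by_bigram = defaultdict(set)
    for hw in headwords:
        if hw in exclude:
            continue
        for i in range(len(hw) - 1):
            words_by_bigram[hw[i:i + 2]].add(hw)
    return {bg: sorted(ws) for bg, ws in words_by_bigram.items()
            if len(ws) <= max_words}

sample = ["GARapiNyAka", "aNyanta", "anAkANkzya", "aBuNkzita"]
rare = rare_bigrams(sample, exclude={"aNyanta"})
print(rare["Ny"])
```

On a four-word sample almost every bigram is rare, so this only becomes informative when run over the full headword list of all dictionaries.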

drdhaval2785 commented 9 years ago

21 jUwikAbaNDa - जूटिकाबङ्ध -> jUwikAbanDa - जूटिकाबन्ध

capture

drdhaval2785 commented 9 years ago

22 daksiRottara - दक्सिणोत्तर -> dakziRottara - दक्षिणोत्तर Print error. Also, the page linked is 207a, whereas the entry is actually on 207.

capture

The next entry is:

capture

So the alphabetical order mandates dakziR

drdhaval2785 commented 9 years ago

23 durvyavahfzti - दुर्व्यवहृष्ति -> durvyavahfti - दुर्व्यवहृति

capture

drdhaval2785 commented 9 years ago

24 Not a headword error, but the SLP1 seems to contain an accent mark, which I have not seen before.

capture

I guess we want dOScitya, not an s with an accent mark, right? We need to check for other occurrences as well.

In pāreskanDam also the same issue - D converted but ā remains.

So this calls for corrections everywhere in this dictionary. Either proper SLP1 or proper IAST. Not something in between.
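Since SLP1 uses only ASCII characters, one simple way to find such half-converted keys is to look for any non-ASCII character. A minimal sketch; the function name is made up:

```python
def find_mixed_keys(keys):
    """Return keys that contain non-ASCII characters.

    A key meant to be SLP1 should be pure ASCII, so a leftover
    IAST letter such as 'ā' shows up as a non-ASCII character."""
    return [key for key in keys if any(ord(ch) > 127 for ch in key)]

sample_keys = ["pāreskanDam", "pAreskanDam", "dOScittya"]
mixed = find_mixed_keys(sample_keys)
print(mixed)
```

This only catches the IAST side of the mix. An IAST letter that happens to be plain ASCII would pass unnoticed.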

drdhaval2785 commented 9 years ago

25 nizpUtigaRDika - निष्पूतिगण्धिक -> nizpUtiganDika - निष्पूतिगन्धिक

capture

drdhaval2785 commented 9 years ago

26 pfzwimAmsAdana - पृष्टिमाम्सादन -> pfzwimAMsAdana - पृष्टिमांसादन Print error

capture

gasyoun commented 9 years ago

24 There can be no s with an accent mark, it's just a coincidence that it looks similar to it. You are right, it's a mix of SLP1 and IAST. Because it was Anglicized Sanskrit before, it was never pure something, it was a mix originally.

drdhaval2785 commented 9 years ago

27 prARmuKAYcana - प्राण्मुखाञ्चन -> prANmuKAYcana - प्राङ्मुखाञ्चन

capture

drdhaval2785 commented 9 years ago

28 BAmkfti - भाम्कृति -> BAMkfti - भांकृति

capture

29 mAmsavarRa - माम्सवर्ण -> mAMsavarRa - मांसवर्ण

capture

30 luYcAlunca - लुञ्चालुन्च -> luYcAluYca - लुञ्चालुञ्च

capture

drdhaval2785 commented 9 years ago

31 haRTA - हण्था -> haRWA - हण्ठा There is no other dictionary with this entry, but RT is not possible. It gets converted to RW by grammatical rule. So possible print error.
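Many of the corrections in this thread are of the same kind (a dental where a retroflex is expected after R or z), so a heuristic like the following might find more candidates. It is a sketch only: the rule is applied mechanically to SLP1 keys, and the function name is made up.

```python
import re

# SLP1 dental stops and their retroflex counterparts.
DENTAL_TO_RETROFLEX = {"t": "w", "T": "W", "d": "q", "D": "Q"}

# Heuristic: t/T directly after z, or t/T/d/D directly after R.
SUSPECT = re.compile(r"(?<=z)[tT]|(?<=R)[tTdD]")

def suggest_retroflex(key):
    """Return a candidate respelling, or None if nothing looks suspicious."""
    fixed = SUSPECT.sub(lambda m: DENTAL_TO_RETROFLEX[m.group(0)], key)
    return fixed if fixed != key else None

for key in ["aduHspfzta", "apratizTAyuka", "kARdarika", "haRTA", "asiyazwi"]:
    print(key, "->", suggest_retroflex(key))
```

The suggestion is not always the right correction. For 23 (durvyavahfzti) and 25 (nizpUtigaRDika) the scan showed a different fix, so each hit still has to be checked by eye.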

capture

32 hAkazta - हाकष्त -> hAkazwa - हाकष्ट Print error capture

drdhaval2785 commented 9 years ago

33 aDvmuKa - अध्व्मुख -> aDomuKa - अधोमुख

capture

34 alaksmIka - अलक्स्मीक -> alakzmIka - अलक्ष्मीक Print error capture

35 cuRwCedane - चुण्ट्छेदने -> cuRw is the only headword. Cedane is explanation.

drdhaval2785 commented 9 years ago

36 nirBrantitA - निर्भ्रन्तिता -> nirBrAntitA - निर्भ्रान्तिता

capture

drdhaval2785 commented 9 years ago

37 bahirvrESravaRa - बहिर्व्रैश्रवण -> bahirvESravaRa - बहिर्वैश्रवण image [ejf changed image]

See bahirvESravaRa:MW

drdhaval2785 commented 9 years ago

38 yAjYvalkyavAdDali - याज्ञ्वल्क्यवाद्धलि -> yAjYavalkyavAdDali - याज्ञवल्क्यवाद्धलि image [ejf changed image]

39 anAkANksya - अनाकाङ्क्स्य -> anAkANkzya - अनाकाङ्क्ष्य Print error capture

drdhaval2785 commented 9 years ago

40 viSvadrSvan - विश्वद्र्श्वन् -> viSvadfSvan - विश्वदृश्वन् ?? EJF: Agree. Print error, missing dot under 'r'. capture

41 kkuqula - क्कुडुल -> Kuqula - खुडुल print error capture

42 ksvelA - क्स्वेला -> kzvelA - क्ष्वेला Print error capture

43 apapivAms - अपपिवाम्स् -> apapivaMs - अपपिवंस्‌ capture

44 sidD - सिद्ध् Please check whether this is a proper headword or not. I see it being given in explanation of sidDi with some [ ] around it. capture

45 hinqanaka - हिन्डनक -> hiRqanaka - हिण्डनक Print error capture

This completes the list of corrections suggested in this thread.

drdhaval2785 commented 9 years ago

The words treated as OK have been checked from the dictionary entries and found OK. So no need to duplicate the work. But if you find time, please see if there has been any oversight.

iqenyakratu - इडेन्यक्रतु - SCH
oQopitam - ओढोपितम् - SCH
koCarabA - कोछरबा - SCH
gaRqaberuRqanfsiMhamantra - गण्डबेरुण्डनृसिंहमन्त्र - SCH
jiGitsA - जिघित्सा - SCH
dodUyamAna - दोदूयमान - SCH
nyUnonnata - न्यूनोन्नत - SCH
pIlU - पीलू - SCH
BUvezwa - भूवेष्ट - SCH
meDUlaka - मेधूलक - SCH
mocowa - मोचोट - SCH
moQerapura - मोढेरपुर - SCH
vAraRavusA - वारणवुसा - SCH
SOcopakaraRa - शौचोपकरण - SCH
SrICattrakarI - श्रीछत्त्रकरी - SCH
aMSumadBedasaMgraha - अंशुमद्भेदसंग्रह - SCH
akulasPIti - अकुलस्फीति - SCH
aYSera - अञ्शेर - SCH
aDijyI - अधिज्यी - SCH
anApnuvant - अनाप्नुवन्त् - SCH
anErBftya - अनैर्भृत्य - SCH
apunarlaBya - अपुनर्लभ्य - SCH
apCawA - अप्छटा - SCH
aploza - अप्लोष - SCH
apSuzka - अप्शुष्क - SCH
abBogIna - अब्भोगीन - SCH
arhaccUqAmaRi - अर्हच्चूडामणि - SCH
avirAwsaMpanna - अविराट्संपन्न - SCH
AsTI - आस्थी - SCH
AsPUrjita - आस्फूर्जित - SCH
uccodarki - उच्चोदर्कि - SCH
uYJA - उञ्झा - SCH
uRqUka - उण्डूक - SCH
utTIBavana - उत्थीभवन - SCH
utpeza - उत्पेष - SCH
utpezwar - उत्पेष्टर् - SCH
udaktUla - उदक्तूल - SCH
uddEva - उद्दैव - SCH
udBI - उद्भी - SCH
kakupkumBin - ककुप्कुम्भिन् - SCH
kakuppAla - ककुप्पाल - SCH
kakupsImantinI - ककुप्सीमन्तिनी - SCH
kaccolaka - कच्चोलक - SCH
karkUra - कर्कूर - SCH
kalpOzaDasevAdiprakAra - कल्पौषधसेवादिप्रकार - SCH
kalvowaka - कल्वोटक - SCH
kARvopanizad - काण्वोपनिषद् - SCH
kukkU - कुक्कू - SCH
kukkoka - कुक्कोक - SCH
kOzWeyaka - कौष्ठेयक - SCH
ganDERa - गन्धैण - SCH
goRqita - गोण्डित - SCH
GurGUraka - घुर्घूरक - SCH
cidBiwa - चिद्भिट - SCH
cEkkasa - चैक्कस - SCH
wuppikA - टुप्पिका - SCH
QuRQOlI - ढुण्ढौली - SCH
dArSI - दार्शी - SCH
dfzadBedaka - दृषद्भेदक - SCH
dErGikeya - दैर्घिकेय - SCH
dOgDikasasya - दौग्धिकसस्य - SCH
dOScittya - दौश्चित्त्य - SCH
DigjAtIya - धिग्जातीय - SCH
nimnIkfta - निम्नीकृत - SCH
niScotana - निश्चोतन - SCH
nErdAkziRya - नैर्दाक्षिण्य - SCH
nyaggAmin - न्यग्गामिन् - SCH
pariDAsyE - परिधास्यै - SCH
pAreskanDam - पारेस्कन्धम् - SCH
pAllIkya - पाल्लीक्य - SCH
piYCikA - पिञ्छिका - SCH
piYCola - पिञ्छोल - SCH
piRqeSUra - पिण्डेशूर - SCH
pippoda - पिप्पोद - SCH
peRWAsTAna - पेण्ठास्थान - SCH
ponti - पोन्ति - SCH
prAjjURaka - प्राज्जूणक - SCH
babbUla - बब्बूल - SCH
bimbOzWI - बिम्बौष्ठी - SCH
bOdbuda - बौद्बुद - SCH
BAtkUwa - भात्कूट - SCH
madyIBU - मद्यीभू - SCH
mAYjU - माञ्जू - SCH
miRWa - मिण्ठ - SCH
muktOzWatA - मुक्तौष्ठता - SCH
musfRWi - मुसृण्ठि - SCH
melyAwI - मेल्याटी - SCH
mOnyA - मौन्या - SCH
yaSorGa - यशोर्घ - SCH
riYColikA - रिञ्छोलिका - SCH
riYColI - रिञ्छोली - SCH
luYCi - लुञ्छि - SCH
leRqikA - लेण्डिका - SCH
vAgjIva - वाग्जीव - SCH
vAgjIvana - वाग्जीवन - SCH
vArDIRasa - वार्धीणस - SCH
SalyIBavati - शल्यीभवति - SCH
SunDi - शुन्धि - SCH
SOWya - शौठ्य - SCH
zaqBakzaRa - षड्भक्षण - SCH
zARmuKa - षाण्मुख - SCH
sarasvatIdrohin - सरस्वतीद्रोहिन् - SCH
saritsUnu - सरित्सूनु - SCH
salleKanA - सल्लेखना - SCH
sAntU - सान्तू - SCH
sItka - सीत्क - SCH
sUdgata - सूद्गत - SCH
sTUlAByUrRa - स्थूलाभ्यूर्ण - SCH
svarnATa - स्वर्नाथ - SCH
svarnArI - स्वर्नारी - SCH
akzyupanizad - अक्ष्युपनिषद् - SCH
aBuNkzita - अभुङ्क्षित - SCH
artni - अर्त्नि - SCH
Avraskya - आव्रस्क्य - SCH
Izatsvinna - ईषत्स्विन्न - SCH
ucculumpyatA - उच्चुलुम्प्यता - SCH
uccErnyAya - उच्चैर्न्याय - SCH
utkzIba - उत्क्षीब - SCH
kuRqyAgrIya - कुण्ड्याग्रीय - SCH
guluYcCA - गुलुञ्च्छा - SCH
cORQya - चौण्ढ्य - SCH
tintrIRIkA - तिन्त्रीणीका - SCH
tottropavAhya - तोत्त्रोपवाह्य - SCH
digdvipa - दिग्द्विप - SCH
dordru - दोर्द्रु - SCH
niScyota - निश्च्योत - SCH
nistviz - निस्त्विष् - SCH
partvA - पर्त्वा - SCH
meRqya - मेण्ड्य - SCH
vEDrya - वैध्र्य - SCH
tArkzyIya - तार्क्ष्यीय - SCH
DUNkzRA - धूङ्क्ष्णा - SCH
ptA - प्ता - SCH
tarq - तर्ड् - SCH

gasyoun commented 9 years ago

1) [] in SCH is not a problem, so sidDi with some [ ] is normal.
2) Regarding the 'Ny' combination: I wonder how many more unique combinations there are. We would need an exclusion list, because grammatical terms should not be counted, as you said.
3) 12 rajjudhāna n. die Stelle am Halse eines Haustieres, an der das Bindeseil befestigt wird, Kaus. 44, 23. This is the only place in the dictionary where I found the German word befestigt.

funderburkjim commented 9 years ago

Modified sch system to use slp1. Will install corrections tomorrow.

It was a nice surprise to see all the corrections done. :+1:

gasyoun commented 9 years ago

slp1 + sch = fun.

funderburkjim commented 9 years ago

Regarding 2 SAate - In sch.txt, here's the digitization (before correction):

.{#SAate#}100{#s4a1(ate)#}¦ auch H 46 , 18 (Ko.). [Schµ25310] €1

General form:
.{#X#}100{#Y#}¦ <body> [Schµ<L>] €1

This is one of the older digitizations, and is peculiar in various ways.

In particular, Thomas appears to have

X is used as key1, and Y is used as key2.

So, key2 IS available.
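For illustration, the general form above could be parsed like this. It is a sketch, not the code used in the SCH correction process; it assumes the numbers after the first key and after the euro sign may vary, and the field names are chosen here.

```python
import re

# Pattern for the general form:  .{#X#}100{#Y#}¦ <body> [Schµ<L>] €1
# Assumption: "100" and the "1" after the euro sign are matched as any digits.
SCH_LINE = re.compile(
    r"^\.\{#(?P<key1>[^#]*)#\}\d+\{#(?P<key2>[^#]*)#\}¦\s*"
    r"(?P<body>.*?)\s*\[Schµ(?P<L>\d+)\]\s*€\d+\s*$"
)

def parse_sch_line(line):
    """Return a dict with key1, key2, body and L, or None if no match."""
    m = SCH_LINE.match(line)
    return m.groupdict() if m else None

sample = ".{#SAate#}100{#s4a1(ate)#}¦ auch H 46 , 18 (Ko.). [Schµ25310] €1"
record = parse_sch_line(sample)
print(record)
```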

In this case, Thomas chose to, in essence, remove the parentheses from key2 to get key1. I don't know whether he did this consistently for verbs in SCH.

In the 'recent' digitizations, Thomas does no 'key2->key1' inference, and that task is done by a program (the hw1.py program). However, in the SCH case, hw1 just takes what Thomas provided (i.e., X) as key1, and does no such key2-key1 inference. [I think this is also true for PW and PWG, whose digitizations were done about the same time as SCH.]

It is possible that a better (more useful) key1 would be obtained for SCH by programmatically computing key1 from key2. For instance, SAl is probably a more useful key1 than SAlate.
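A minimal sketch of what such a key2-to-key1 inference might look like. This is not what hw1.py does; it assumes key2 has already been converted to SLP1 and that the only meta characters are parentheses, the degree sign and *.

```python
import re

def key1_candidates(key2):
    """Return (stem, full) candidates for key1 from an SLP1 key2.

    stem: parenthesized endings dropped,  e.g. 'SAl(ate)' -> 'SAl'
    full: parentheses removed only,       e.g. 'SAl(ate)' -> 'SAlate'
    """
    cleaned = re.sub(r"[*°]", "", key2)       # drop assumed meta characters
    stem = re.sub(r"\([^)]*\)", "", cleaned)  # drop '(...)' groups
    full = cleaned.replace("(", "").replace(")", "")
    return stem, full

print(key1_candidates("SAl(ate)"))
print(key1_candidates("*indindira"))
```

Returning both candidates leaves the choice open: `stem` would suit a list of roots, while `full` matches what Thomas appears to have done for SAlate.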

Since I am not personally very interested in SCH, my enthusiasm for writing this is lukewarm at best.

funderburkjim commented 9 years ago

Re 4 atiyAcnA - अतियाच्ना ??? Google translates the definition shown, 'Zudringlichkeit', as 'intrusiveness', which is consistent with ati + yAcYA. Thus, I think we should go with atiyAcYA, and attribute this to a print error (omitted tilde over 'n').

We did the same sort of inference in 5, where the print forgot the dot below the 't'.

funderburkjim commented 9 years ago

Re 12 asamtfRRa - असम्तृण्ण Found it on the preceding page (84-2), after asaMtApa.

image

Another flaw in the sch.txt digitization is that some lines are out of order. There is a special step in the correction process for SCH to handle this. It currently uses an input file change_line.txt.

move 3259 after 3245
move 14383 after 14381
move 15076 after 15079
move 15077 after 15055
move 15078 after 15063
delete 23079

Since asaMtfRRa is at line 6184 of digitization, and asaMtApa at line 6137, adding the following line

move 6184 after 6137

to change_line.txt should solve the misplacement.
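To make the intended effect concrete, here is a small sketch of how such instructions could be applied. It is not the actual correction step; it assumes all numbers are 1-based line numbers in the original file, and that 'move A after B' puts original line A directly after original line B.

```python
def apply_changes(lines, instructions):
    """Apply 'move A after B' and 'delete A' instructions to a list of lines.

    Assumption: A and B always refer to line numbers of the ORIGINAL
    file (1-based), regardless of earlier moves or deletions."""
    items = list(enumerate(lines, start=1))  # (original number, text)

    def index_of(number):
        for i, (num, _) in enumerate(items):
            if num == number:
                return i
        raise ValueError(f"line {number} not found")

    for instruction in instructions:
        parts = instruction.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "move" and len(parts) == 4 and parts[2] == "after":
            item = items.pop(index_of(int(parts[1])))
            items.insert(index_of(int(parts[3])) + 1, item)
        elif parts[0] == "delete" and len(parts) == 2:
            items.pop(index_of(int(parts[1])))
        else:
            raise ValueError(f"unrecognized instruction: {instruction!r}")
    return [text for _, text in items]

demo = apply_changes(["a", "b", "c", "d", "e"], ["move 5 after 1", "delete 3"])
print(demo)
```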

Incidentally, since MW has saMtfRRa, I think the spelling of a-saMtfRRa in SCH is ok.