sanskrit-lexicon / CORRECTIONS

Correction history for Cologne Sanskrit Lexicon

`o` vs `O` Corrections in PWG, Part 1 #130

Closed zaaf2 closed 8 years ago

zaaf2 commented 8 years ago

This issue is about an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html, generated by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG.
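A minimal sketch of the kind of comparison behind such a list. It is not taken from the o_vs_O code itself; it assumes that headwords are in SLP1, that two forms are paired when they differ only in the vowels a/A, i/I, u/U, e/E, o/O, and that a pair is reported when the first form occurs in exactly one dictionary and the second in two or more. The function name and the toy attestation data are hypothetical.

```python
from collections import defaultdict

# Assumption: fold SLP1 long vowels and diphthongs onto their short
# counterparts, so that forms differing only in these letters collide.
FOLD = str.maketrans("AIUEO", "aiueo")


def suspicious_pairs(attestations):
    """Return (rare_form, common_form) pairs.

    `attestations` maps an SLP1 headword to the set of dictionaries
    that contain it. A pair is reported when both forms fold to the
    same key, the first is found in exactly one dictionary and the
    second in at least two.
    """
    groups = defaultdict(list)
    for headword in attestations:
        groups[headword.translate(FOLD)].append(headword)

    pairs = []
    for forms in groups.values():
        for rare in forms:
            for common in forms:
                if (rare != common
                        and len(attestations[rare]) == 1
                        and len(attestations[common]) >= 2):
                    pairs.append((rare, common))
    return sorted(pairs)


# Toy data: the dictionary sets are invented for illustration only.
toy = {
    "devAvasa": {"PWG"},
    "devAvAsa": {"MW", "PW"},
    "keSara": {"PWG", "MW"},
}
result = suspicious_pairs(toy)
```

Every pair produced this way is only a candidate: as the cases below show, it may turn out to be an OCR error, a factual error, an alternative form or a false positive.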

OCR error.

image

zaaf2 commented 8 years ago

False positive. PWG has: image

(…)

image

असूक in AP has another origin and meaning: “असूक a. See असूयक.”

zaaf2 commented 8 years ago

Factual error: image

As mentioned by PW, the Tandya Brahmana 21,2,5 has आच्यादोह : image

According to MW आच्या in आच्यादोह (with ā) comes from the Vedic ind. p. (aka gerund) of आच् (< आ-√अच्), instead of the regular ind. p. with ă आच्य. Here the MW screenshot:

image

Regarding the Vedic ind.p., v. MacDonell, A Vedic Grammar for Students: image

zaaf2 commented 8 years ago

Factual error. As mentioned by PW, the Tandya Brahmana 12,11,15 has आतीषादीय :

image

zaaf2 commented 8 years ago

Factual error, corrected by PWG itself in the section „Verbesserungen und Nachträge“ (vol. 5):

आषाडी [L=9704] [p= 1-0728] (°ढी?) f. N. pr. einer Localität R. 4, 27, 11.

आषाडी [L=67037] [p= 5-1128] zu streichen, da an der angeführten Stelle आषाढी in der gangbaren Bed. zu lesen ist.

gasyoun commented 8 years ago

@funderburkjim is Devanagari OK with you? @zaaf2 amazing work! Wonder how many thousands of mistakes are already documented in the „Verbesserungen und Nachträge“ part. Similar work was integrated only in MW. Never in PWG or any other.

drdhaval2785 commented 8 years ago

Good to see that the raw data is now put to analysis and corrections are pouring in. Good work.

zaaf2 commented 8 years ago

@drdhaval2785 I should have mentioned in the opening of this issue that its object is an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html, generated by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG. I will edit now the opening of this issue to correct this.

zaaf2 commented 8 years ago

False positive.

image

Just another way to write इङ्कार, as in MW: इङ्कार [p= 164] : and इङ्-कृत = हिङ्-कार, हिङ्कृत q.v. [L=28636]

zaaf2 commented 8 years ago

False positive.

image

केशरिन् and केसरिन् are alternative forms of the same word. Cf. MW:

केशरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56139]; m. (ई) a lion MBh. Suṡr. Bhartṛ. &c [L=56140] केसरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56151] m. a lion MBh. Suṡr. Bhartṛ. &c [L=56152]

zaaf2 commented 8 years ago

False positive

PWG: image

MW: एलाय [p= 232] : Nom. P. एलायति, to be wanton or playful, be merry. [L=40174] ईल् [p= 170] : Caus. P. ईलयति, to move TS. vi, 4, 2, 6 (cf. ईर्, Caus.) [L=29820]

SCH: īlay [L=7805] [p= 107-2], īláyati sich bewegen , TS. 6 , 4 , 2 , 6. Vgl. Kaus. von īr. -- Auch: von der Stelle bewegen , Āpast. Śr. 1 , 16 , 11. 4

zaaf2 commented 8 years ago

Insufficient elements to reach a conclusion.

PWG: image

MW: उत्पला-वती [p= 181] : f. N. of a river MBh. [L=31710]; f. of an अप्सरस्. [L=31711] PW: उत्पलावती [L=18409] [p= 1225-1] f. N.pr. eines Flusses Mbh.6,342. = ताम्रपर्णी Gal.

zaaf2 commented 8 years ago

Factual error. Śaṅkara’s work is called उपदेशसाहस्री

PWG: image

MW: उप-देश-साहस्री [p= 199] : f. N. of certain works. [L=34706] साहस्र [p= 1212] : mf(ई, or आ)n. (fr. सहस्र) relating or belonging to a thousand, consisting of or bought with or paid for a thousand, thousand fold, exceedingly numerous, infinite VS. &c [L=243642]

VCP : image

zaaf2 commented 8 years ago

Acceptable alternative forms.

PWG:

image

MW: उष्मन् [p= 220] : m. heat, ardour, steam Mn. MBh. Suṡr. &c (in many cases, where the initial उ is combined with a preceding अ, not to be distinguished from ऊष्मन् q.v.) [L=37852] ऊष्मन् [p= 223] : m. ( √उष् cf. उष्मन्), heat, glow, ardour, hot vapour, steam, vapour AV. vi, 18, 3 VS. ṠBr. KātyṠr. BhP. (also figuratively said of passion or of money &c ) [L=38352]

zaaf2 commented 8 years ago

Factual error. Already corrected in PWG in „Nachträge“ (vol. 7): एकव्यवहारिक [L=119212] [p= 7-1722] (Nachträge), wohl °व्यावहारिक zu verbessern.

zaaf2 commented 8 years ago

OCR error.

image

zaaf2 commented 8 years ago

Factual error. The change should include ऐन्द्रावरुण and ऐन्द्रावारुण (both forms incorrectly mentioned in PWG with first ă instead of ā)

PWG: ऐन्द्रवरुण [L=69068] [p= 5-1223] adj. zu Indra und Varuṇa in Beziehung stehend Ait. [Page05.1224] Br. 6, 14. 25. °वारुण Pańḱav. Br. 8, 8, 6.

MW: ऐन्द्रावरुण [p= 234] : mfn. relating to इन्द्र and वरुण AitBr. Vait. [L=40471] ऐन्द्रावारुण [p= 234] : mfn. = ऐन्द्रावरुण above TāṇḍyaBr. [L=40473]

The Tandya Brahmana 8.8.6 has ऐन्द्रावारुण : image

zaaf2 commented 8 years ago

Factual error.

PWG: image

MW: कन्य-कुमारी [p= 249] : f. N. of दुर्गा TĀr. [L=43075] कन्या-कुमारी [p= 249] : f. = कन्य-कु° कुमारि [p= 292] : (shortened for °री q.v. ; cf. Pāṇ. 6-3, 63) कुमारी a [p= 292] : f. a young girl, one from ten to twelve years old, maiden, daughter AV. AitBr. &c [L=52291]

Taittiriya Aranyaka 10.1.7: image

zaaf2 commented 8 years ago

False positive. केशर and केसर are alternative forms of the same word (cf. case 7).

PWG: image

MW: केशर [p= 310] : &c » केसर. [L=56028]

zaaf2 commented 8 years ago

Insufficient elements to reach a conclusion. PWG: image

PW: image The mentioned work is in a manuscript edition: image

MW: कुडूहुञ्ची [p= 289] : f. (a Mahratti N. of) Solanum trilobatum Npr. [L=51760]

(Npr. = निघण्टुप्रकाश)

zaaf2 commented 8 years ago

OCR error

image

zaaf2 commented 8 years ago

Factual error. Although the form क्रोलायन is found in a manuscript (MS., v. MW), it is ungrammatical. The secondary suffix आयन, forming patronymics, requires vṛddhi-strengthening of the first syllable (cf. Whitney, Sanskrit Grammar, 1219).

PWG: image

PW: image

MW: क्रौलायन a [p= 323] : m. patr. fr. क्रोल (for °ड) Pravar. (क्रोल्° MS.) [L=58522]
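The vṛddhi-strengthening invoked here can be illustrated mechanically. The following is only a sketch, assuming SLP1 transliteration and the standard vṛddhi substitutions (a → A, i/I/e → E, u/U/o → O, f/F → Ar, x/X → Al). The function name is hypothetical, and it ignores every exception and special case of the grammar.

```python
# SLP1 vowel -> its vrddhi grade (assumed standard substitutions).
VRDDHI = {
    "a": "A", "A": "A",
    "i": "E", "I": "E", "e": "E", "E": "E",
    "u": "O", "U": "O", "o": "O", "O": "O",
    "f": "Ar", "F": "Ar",
    "x": "Al", "X": "Al",
}


def vrddhi_first_syllable(slp1):
    """Replace the first vowel of an SLP1 string by its vrddhi grade."""
    for i, ch in enumerate(slp1):
        if ch in VRDDHI:
            return slp1[:i] + VRDDHI[ch] + slp1[i + 1:]
    return slp1  # no vowel found: return the input unchanged


expected_patronymic = vrddhi_first_syllable("krolAyana")
```

Applied to krolAyana (क्रोलायन), this yields krOlAyana (क्रौलायन), the form given as headword by MW.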

zaaf2 commented 8 years ago

OCR error. image

zaaf2 commented 8 years ago

False positive. Different words.

PWG: image चतुस्तन [L=24654] [p= 2-0935] (चतुर् + स्तन) adj. f. vierzitzig: गौः Çat. Br. 6, 5, 2, 18. स्तन 1) die weibliche Brust, Zitze (bei Menschen und Thieren)

MW: चतुः-स्थान [p= 384] : » चतु-स्°. [L=71129] चतु-स्थान [p= 383] : mfn. having a fourfold basis Nār. i, 8. [L=71082] स्तन [p= 1257] : m. (…) the female breast (either human or animal) , teat, dug, udder RV. &c [L=254308]

gasyoun commented 8 years ago

@Shalu411 have you ever heard of a Marathi word like the one in case 17?

zaaf2 commented 8 years ago

Regarding case 19 (क्रोलायन → क्रौलायन), now I think it is better to preserve the reading क्रोलायन. It is an attested form, mentioned as such by MW and PW. I think it is important to preserve as much as possible the correspondence between the digital and the printed version, which should be treated as a historical document, with its imperfections and all.

zaaf2 commented 8 years ago

Acceptable alternative form. image

MW: उषण [p= 220] : n. black pepper [L=37701]; n. the root of Piper Longum [L=37702] ऊषण [p= 223] : n. black pepper Suṡr. [L=38346]

zaaf2 commented 8 years ago

Alternative form, mentioned as such by PWG: image (I think each of these forms should be accessible as headwords)

MW: चिलिचिम [p= 399] : m. a kind of fish Car. i, 25 Suṡr. i, 20, 3 and 8. [L=74362] चिलिची°मि [p= 399] : m. id. L. Sch. » also चिलमीलिका. [L=74364]

zaaf2 commented 8 years ago

False positive. A form reported as incorrect by PWG itself: image

BEN: jAmbunadamaya [L=5388] [p= 0330-b] and jAmbUnadamaya [Page0331-a+ 39]; jâmbu¤10nada + maya, adj., f. yî, Golden, Pańch. 175, 8.

gasyoun commented 8 years ago

24 is not actually a false positive. I would make another list of fehlerhaft words, i.e. words that the dictionary makers themselves knew to be wrong. What do you think? A search for fehlerhaft für in the downloaded dictionary file would give hundreds of cases.

zaaf2 commented 8 years ago

Not a false positive, I agree, but it should be preserved anyway. If the author decided to register such incorrect forms, it is because he probably found them in one or more texts. For a researcher or reader who encounters these forms in those texts, the information that they are wrong, and the reference to the correct form, may be very useful. As regards the list you propose, I think it would be a good idea to exclude automatically the words marked as fehlerhaft (in PWG etc.) or w.r. (in MW) from the lists generated by the o_vs_O method, and to transfer them to another list of words with a lower probability of error.
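A rough sketch of the automatic exclusion suggested here. It assumes the candidate list is a list of headwords and that the text of each entry is available in a mapping; the function name, the data layout and the sample entry texts are all hypothetical, and the plain substring test is only a first approximation.

```python
# Markers named in this thread: fehlerhaft (PWG etc.) and w.r. (MW).
MARKERS = {
    "PWG": ("fehlerhaft",),
    "MW": ("w.r.",),
}


def split_candidates(candidates, entries, dictionary):
    """Split candidates into (kept, flagged).

    `entries` maps a headword to the full text of its entry in
    `dictionary`. A headword is flagged when its entry contains one
    of the markers for that dictionary; flagged headwords go to a
    separate list instead of being discarded.
    """
    markers = MARKERS[dictionary]
    kept, flagged = [], []
    for headword in candidates:
        text = entries.get(headword, "")
        if any(marker in text for marker in markers):
            flagged.append(headword)
        else:
            kept.append(headword)
    return kept, flagged


# Invented sample texts, for illustration only.
sample_entries = {
    "hw1": "adj. golden; fehlerhaft für hw1a",
    "hw2": "m. N. pr. eines Prinzen",
}
kept, flagged = split_candidates(["hw1", "hw2"], sample_entries, "PWG")
```

The flagged list keeps the information that a form is attested but marked as wrong, which is what a reader meeting that form in a text would need.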

gasyoun commented 8 years ago

“information that they are wrong, and the reference to the correct form, may be very useful”: sure. There are not that many fehlerhaft (in PWG etc.) or w.r. (in MW) words in these lists, I hope, because the only person who could integrate this is @drdhaval2785 and he is busy now. Otherwise I agree, that's logical.

zaaf2 commented 8 years ago

Factual error.

image

Words formed with the suffix -ika require vṛddhi-strengthening of the initial syllable. Cf. Whitney’s Sanskrit Grammar 1204 and 1222 j: image

There is a typographical error in the MW entry: image ज्येस्क्ठ-सामन् should be changed to ज्येष्ठ-सामन्. image

For anyone interested in finding the word ज्यैष्ठसामिक in the cited work goBila-SrAdDa-kalpa (I was unable to), I think it is contained in this book here.

funderburkjim commented 8 years ago

@zaaf2 Yes, Devanagari is ok.

funderburkjim commented 8 years ago

Re: 25. ज्येष्ठसामिक → ज्यैष्ठसामिक

What do you make of the fact that PWG has both ज्येष्ठसामन् and ज्यैष्ठसामन् and calls the latter a false reading?

ज्येष्ठसामन् [p= 3-0159] : (ज्येष्ठ + सामन्) — 1) n. N. eines best. Sâman Gobh. 3, 2, 41. Ind. St. 3. 205. ज्येष्ठसाम्ना च देवेशं जगौ नारायणः Mbh. 13, 876. ज्येष्ठसामग M. 3, 185. ज्येष्ठसामव्रतो हरिः Mbh. 12, 13593. ज्येष्ठसामाज्यदोह Ind. st. 3, 217.
— 2) adj. der dieses Sâman singt Jâǵń. 1, 219. [L=27962] [p= 5-1451] : — 1) Pańḱav. Br. 21, 2, 3. [L=74575]

ज्यैष्ठसामन् [p= 3-0159] : in Çkdr. und bei Wils. falsche Form für ज्येष्ठ°. [L=27975]

A theory: the suffix 'ka' applies to sAman, yielding sAmika (the A is already vfdDi); so the compound is jyezWa + sAmika.

सामिक [p= 7-0937] : — 1) adj. von सामन् Gesang Lâṭj. 7, 9, 7.
— 2) m. Baum (!) H. ç. 172.
— Vgl. सामक. [L=108511]

gasyoun commented 8 years ago

@funderburkjim I guess what @zaaf2 means, and I agree with him, is that false readings old enough to be known to PWG should be left as they are and marked in a separate list. They should never be corrected, but could eventually link back to the correct form. As the articles are not interlinked now, I am not sure whether that will ever be possible.

Not all errors (= fehlerhaft) known to Böhtlingk in PWG are equal. There are 2075 lines that contain some form of the word fehlerhaft (without endings) [1]. But B. is not sure about all of them; when unsure, he adds wohl, so it becomes wohl fehlerhaft [2]. Even fehlerhaft in an entry does not mean that the whole entry is invalid: it may relate to 1 out of 10 quotes and not to the headword in general [4]. The simplest cases are those where the entry is short, has no quotes, and clearly concerns the headword only [3]. In these cases the headword might have been included just to cross-link.

1) 100% sure - fehlerhaft für

2) 50% sure [?] - wohl fehlerhaft, vielleicht fehlerhaft:

3) In every text, general:

4) In a separate text, concrete:

5) False positives, that contain fehlerhaft, but should not be excluded in any way.

@zaaf2 agree with such classification or think there is no need in it?
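A first pass over the fehlerhaft lines could be automated along these lines. This is only a sketch of categories 1 and 2 above, assuming the input is one line of dictionary text at a time; the function name and the labels are hypothetical, and categories 3 to 5 would still need the entry structure (length, presence of quotes) or manual review.

```python
import re

# Category 2: hedged statements (wohl fehlerhaft, vielleicht fehlerhaft).
UNSURE = re.compile(r"\b(wohl|vielleicht)\s+fehlerhaft", re.IGNORECASE)
# Category 1: plain "fehlerhaft für".
SURE = re.compile(r"fehlerhaft\s+für", re.IGNORECASE)


def classify(line):
    """Return 'unsure', 'sure', 'other', or None if fehlerhaft is absent."""
    if "fehlerhaft" not in line.lower():
        return None
    # Test the hedged pattern first, so that "wohl fehlerhaft für"
    # is not counted as a 100% sure case.
    if UNSURE.search(line):
        return "unsure"
    if SURE.search(line):
        return "sure"
    return "other"  # left for manual review


labels = [
    classify("X fehlerhaft für Y"),
    classify("wohl fehlerhaft für Y"),
    classify("eine fehlerhafte Schreibart"),
    classify("m. N. pr. eines Flusses"),
]
```

Lines labelled "other" are deliberately not decided by the script, since fehlerhaft may refer to a single quote rather than to the headword.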

gasyoun commented 8 years ago

https://raw.githubusercontent.com/sanskrit-lexicon/CORRECTIONS/master/PWG-2075-fehlerhaft.xml - extracted full list of fehlerhaft. In several ways it is similar to an older list https://github.com/sanskrit-lexicon/CORRECTIONS/blob/master/PWK-zu-lesen-325.txt

zaaf2 commented 8 years ago

@gasyoun The classification is excellent. A look at the list you generated shows that the most relevant cases (i.e., those to be included in category 1, 100% sure) include not only fehlerhaft für but also fehlerhafte Schreibart and fehlerhafte Variante (or Var.), and perhaps other such designations. If those cases are excluded from the list generated by the o_vs_O method, we will get a list with a higher probability of errors to be corrected. For example, the PWG list includes 3103 cases, which is a humanly unworkable number. Of course the most relevant are those 182 cases in which one dictionary disagrees with two or more others. Anyway, it would be desirable to narrow down the total number of cases as much as possible, if the rest of the list is to be of any use.

Other suggestions are:

zaaf2 commented 8 years ago

@funderburkjim I think the fact that PWG calls ज्यैष्ठसामन् a false reading only confirms that the right reading is ज्यैष्ठसामिक. The wrong form ज्यैष्ठसामन् can most probably be explained as a retro-derivation from the adjective ज्यैष्ठसामिक, and it is wrong because in this case the vṛddhi-strengthening is unjustified. This proves that the strengthening was already there, in the adjective. As seen in case 1 of issue #127 the syllable subjected to vṛddhi can be the first syllable of a compound (v. Whitney’s Sanskrit Grammar 1204)

It is possible that the vṛddhi should be conceived as already included in the ā of -सामन्, but I have a feeling that this is improbable, as the resulting form would lose its clear derivative aspect. Of course the definitive proof could only be provided by the mentioned passage.

zaaf2 commented 8 years ago

No change. The form ज्वाला- is marked as a possible one by PWG. image

The mentioned work is in a manuscript edition: image

zaaf2 commented 8 years ago

Factual error.

PWG: त्रैयरुण [L=31311] [p= 3-0454] und त्रैयरुणि s. u. त्रय्यरुण.

The PWG section „Verbesserungen und Nachträge“ (vol. 5) has: त्रैयरुण [L=75385] [p= 5-1481] lies त्रय्यारुण st. त्रय्यरुण

The question is how to correct this error: (a) to preserve त्रैयरुण (and त्रैयरुणि) and to change only त्रय्यरुण to त्रय्यारुण; or, considering that the word comes from आरुण, (b) to change the whole entry as follows: त्रैयारुण [L=31311] [p= 3-0454] und त्रैयारुणि s. u. त्रय्यारुण. I am inclined to this last solution.

Cf. MW: त्रय्यारुण [p= 457] : m. (for त्र्य्-आरुण) N. of a prince (…) आरुण [p= 150] : mf(ई)n. coming from or belonging to अरुण [L=26246]

zaaf2 commented 8 years ago

Factual error, already mentioned in the PWG section „Verbesserungen und Nachträge“ (vol. 5): दर्शनवरणीय [L=75536] [p= 5-1487] lies दर्शना° und vgl. दर्शनावरण Wilson, Sel. Works 1, 317. 310 (hier fälschlich दर्शनावसान). Sarvadarçanas. 38.

zaaf2 commented 8 years ago

29. देवावस → देवावास OCR error image

30. देवावस ― दैववश No change. Different words. (देवावस v. case 29)

31. द्वदशमहावाक्य → द्वादशमहावाक्य OCR error. image

32. धूलिकदम्ब → धूलीकदम्ब OCR error. image

33. नेशिक → नैशिक OCR error (due to the poor quality of the printed text). image

34. पत्त्रशिरा ― पत्त्रसिरा No change. Alternative forms. image ― Cf. MW: सिरा [p= 1217] : f. (fr. √ सृ) a stream, water RV. i, 121 (cf. Naigh. i, 12 ; often written शिरा) [L=244948]

gasyoun commented 8 years ago

@zaaf2 1. appears inside one of the entries - tough one. Wonder if Jim will like the idea, I guess not, because it involves some artificial intelligence if you ask me. The result might be not as fruitful as the work involved.

  1. To exclude and insert in a separate list the headwords which appear in two different entries in PWG - makes sense, but impossible to realise without Jim's Python scripts. Easy one.

As per 27 - let Jim decide. I worry only about headwords and a few tags. Because the rest is just too huge.

zaaf2 commented 8 years ago

35. पराक्रमकेशरिन् ― पराक्रमकेसरिन् No change. केशरिन् and केसरिन् are alternative forms of the same word (cf. case 7).

PWG: पराक्रमकेशरिन् (प° + के°) m. N. pr. eines Prinzen, eines Sohnes des Vikramakeçarin, Vet. in Verz. d. Oxf. H. 152,b,14.

MW: परा-क्रम-केसरिन् [p= 589] : m. N. of a prince (son of विक्रम-केसरिन्) Vet. [L=116760]
केशरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56139]; m. (ई) a lion MBh. Suṡr. Bhartṛ. &c [L=56140]
केसरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56151] m. a lion MBh. Suṡr. Bhartṛ. &c [L=56152]

36. पुरुषकेशरिन् ― पुरुषकेसरिन् No change. केशरिन् and केसरिन् are alternative forms of the same word (cf. case 7 and 35).

37. पुरुषघ्न ― पूरुषघ्न No change. Alternative form (metri causā).

PWG: पुरुषघ्न [L=46166] [p= 4-0796] (पु° + घ्न) adj. Leute treffend, - tödtend Ṛv. 1, 114, 10. स्त्री पुरुषघ्नी eine Frau, die ihren Mann getödtet hat, Jâǵń. 2, 278.
पूरुष [L=46685] [p= 4-0837] s. पुरुष.

MW: पूरुष-घ्न [p= 643] : mfn. slaying men RV. [L=127810]
पूरुष [p= 643] : m. (mc.) = पुरुष RV. &c &c [L=127809]

gasyoun commented 8 years ago

27 not just No change. The ? should remain with the question mark. The ? = German wohl. Agree?

zaaf2 commented 8 years ago

@gasyoun I suppose you are referring to case 26 (ज्वलारासभकामय ― ज्वालारासभकामय). What I mean is that the PWG author considered the possibility of the word being written as composed with ज्वाला. He is probably questioning the form he found in the manuscript (ज्वला-) or perhaps the manuscript didn’t give a clear reading. I think in this case a change is not justified.

zaaf2 commented 8 years ago

38. पूर्वभाद्रपदा ― पूर्वाभाद्रपदा No change. Alternative form (v.l. = varia lectio).

PWG: पूर्वभाद्रपदा [L=46835] [p= 4-0848] (पूर्व + भा°) f. N. des 25ten Nakshatra H. 115, v. l. °योगे Mbh. 13, 3282. Vp. 226, N. 21. °पद Colebr. Misc. Ess. Ii, 343.

MW: पूर्वा-भाद्रपदा [p= 644] : f. the 25th नक्षत्र MBh. (v.l. पूर्व-भ्°). [L=128199]
पूर्व-भाद्रपद [p= 644] : m. (and f(आ). pl.) the 25th नक्षत्र, the former of the two called भाद्रपदा (containing two stars) MBh. VP. Col. [L=128043]

39. बाध्योगायन ― बाध्यौगायन No change. In the PWG section Verbesserungen und Nachträge zum ganzen Werke (vol. 7) the form बाध्यौगायन was considered as a possible correction to the reading बाध्योगायन.

PWG: बाध्योगायन [L=52413] [p= 5-0068] m. patron. von बाध्योग gaṇa हरितादि zu P. 4, 1, 100.
बाध्योगायन [L=121273] [p= 7-1781] nach gaṇa अनुशतिकादि zu P. 7, 3, 20 könnte man बाध्यौ° erwarten.

40. बावाशस्त्रिन् → बावाशास्त्रिन् Factual error. image

शास्त्रिन् gives the only sense suited to the character of an author. बावा-शास्त्रिन् is the name of someone also called बावा-देव, so that the second part of the name is almost an epithet and has to suit the character of the man, here a learned man, not a warrior (शस्त्रिन् 2, having weapons). शस्त्रिन् in the sense of a reciter is rare.

MW: बावा-शास्त्रिन् [p= 729] : and बावा-देव m. N. of authors Cat. [L=144827.01]
शास्त्रिन् [p= 1069] : mfn. or m. versed in the शास्त्रs, learned (cf. सतत-श्°) Cat. [L=216457]; m. a teacher of sacred books or science, a learned man W. [L=216458]
शस्त्रिन् 1 [p= 1044] : mfn. (for 2. » [p= 1061,2]) reciting, a reciter ĀpṠr. Sch. [L=210890]
शस्त्रिन् 2 [p= 1061] : mfn. having weapons, bearing arms, armed with a sword MBh. Hariv. Kām. &c [L=214641]

gasyoun commented 8 years ago

39 a possible correction should be marked as [?] and not just ignored. @funderburkjim what do you think?

zaaf2 commented 8 years ago

41. भ्रष्ट्रज → भ्राष्ट्रज OCR error. image

42. मसीधनी ― मसीधानी No change. Attested incorrect form (मसी, incorrectly for मषी, MW).

PWG: image

SHS: मसीधानी [L=30820] [p= 555-b] f. (-नी) An inkstand. E. मसी ink, धानी what holds.

MW: मसि [p= 794] : and मसी, incorrectly for मषि and मषी q.v. (मसी- √भू, to become black Ṡiṡ. xx, 63 ; cf. मषी-भावुक) [L=159098]
मषी-धनी [p= 793] : f. an ink-stand L. [L=159047]
मषि-धान [p= 793] : n. an ink-stand L. [L=159036]

43. महारोद्र → महारौद्र OCR error. image

44. माधवश्रम → माधवाश्रम OCR error. image

45. रक्तकेशर ― रक्तकेसर No change. केशर and केसर are alternative forms of the same word (cf. cases 7 and 16). MW: केशर [p= 310] : &c » केसर. [L=56028]

zaaf2 commented 8 years ago

46. लक्षणवादरहस्य ― लक्षणावादरहस्य Insufficient elements to reach a conclusion. PWG image PW: image ACC: image MW: लक्षण [p= 892] : mfn. indicating, expressing indirectly Vedântas. [L=180381] (…); n. (ifc. f(आ).) a mark, sign, symbol, token, characteristic, attribute, quality (…); n. a lucky mark, favourable sign (…); n. accurate description, definition, illustration (…) लक्षणा b [p= 892] : f. aiming at, aim, object, view Hariv. [L=180434]; f. indication, elliptical expression, use of a word for another word with a cognate meaning (as of " head " for " intellect "), indirect or figurative sense of a word (…)