Closed zaaf2 closed 8 years ago
False positive. PWG has:
(…)
असूक in AP has another origin and meaning: “असूक a. See असूयक.”
Factual error:
As mentioned by PW, the Tandya Brahmana 21,2,5 has आच्यादोह :
According to MW आच्या in आच्यादोह (with ā) comes from the Vedic ind. p. (aka gerund) of आच् (< आ-√अच्), instead of the regular ind. p. with ă आच्य. Here the MW screenshot:
Regarding the Vedic ind. p., v. Macdonell, A Vedic Grammar for Students:
Factual error. As mentioned by PW, the Tandya Brahmana 12,11,15 has आतीषादीय :
Factual error, corrected by PWG itself in the section „Verbesserungen und Nachträge“ (vol. 5):
आषाडी [L=9704] [p= 1-0728] (°ढी?) f. N. pr. einer Localität R. 4, 27, 11.
आषाडी [L=67037] [p= 5-1128] zu streichen, da an der angeführten Stelle आषाढी in der gangbaren Bed. zu lesen ist.
@funderburkjim is Devanagari OK with you? @zaaf2 amazing work! Wonder how many thousands of mistakes are already documented in the „Verbesserungen und Nachträge“ part. Similar work was integrated only in MW. Never in PWG or any other.
Good to see that the raw data is now put to analysis and corrections are pouring in. Good work.
@drdhaval2785 I should have mentioned in the opening of this issue that its object is an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html, generated by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG. I will edit now the opening of this issue to correct this.
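For readers unfamiliar with the method, here is a minimal sketch of how such a list could be produced. It is not the actual o_vs_O code. It assumes that headwords are in SLP1, that the method pairs forms differing in exactly one short/long (or guṇa/vṛddhi) vowel, and that a pair is reported when the first form occurs only in the target dictionary while the second occurs in two or more others. The names `CONFUSABLE`, `variants` and `suspicious_pairs` are hypothetical.

```python
from collections import defaultdict

# Hypothetical list of confusable letters (SLP1): short/long and guna/vrddhi vowels.
CONFUSABLE = [("a", "A"), ("i", "I"), ("u", "U"), ("e", "E"), ("o", "O")]


def variants(word):
    """Yield every form obtained by swapping exactly one confusable letter."""
    for idx, ch in enumerate(word):
        for short, long_ in CONFUSABLE:
            if ch == short:
                yield word[:idx] + long_ + word[idx + 1:]
            elif ch == long_:
                yield word[:idx] + short + word[idx + 1:]


def suspicious_pairs(headwords, target="PWG"):
    """headwords maps a dictionary code to its set of SLP1 headwords.

    Returns tuples (form found only in target, competing form,
    dictionaries supporting the competing form).
    """
    attested = defaultdict(set)
    for code, words in headwords.items():
        for w in words:
            attested[w].add(code)

    results = []
    for word in sorted(headwords.get(target, ())):
        if attested[word] != {target}:
            continue  # the form is confirmed by at least one other dictionary
        for other in variants(word):
            support = attested.get(other, set()) - {target}
            if len(support) >= 2:  # "more dictionaries in second word"
                results.append((word, other, sorted(support)))
    return results
```

With toy data such as `{"PWG": {"mahArodra"}, "MW": {"mahArOdra"}, "PW": {"mahArOdra"}}`, the pair `mahArodra` / `mahArOdra` would be reported. Whether it is an OCR error, a factual error or a false positive still has to be decided by hand, as in the cases analysed here.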
False positive.
Just another way to write इङ्कार, as in MW:
इङ्कार [p= 164] : and इङ्-कृत = हिङ्-कार, हिङ्कृत q.v. [L=28636]
False positive.
केशरिन् and केसरिन् are alternative forms of the same word. Cf. MW:
केशरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56139]; m. (ई) a lion MBh. Suṡr. Bhartṛ. &c [L=56140]
केसरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56151] m. a lion MBh. Suṡr. Bhartṛ. &c [L=56152]
False positive
PWG:
MW:
एलाय [p= 232] : Nom. P. एलायति, to be wanton or playful, be merry. [L=40174]
ईल् [p= 170] : Caus. P. ईलयति, to move TS. vi, 4, 2, 6 (cf. ईर्, Caus.) [L=29820]
SCH:
īlay [L=7805] [p= 107-2], īláyati sich bewegen , TS. 6 , 4 , 2 , 6. Vgl. Kaus. von īr. -- Auch: von der Stelle bewegen , Āpast. Śr. 1 , 16 , 11. 4
Insufficient elements to reach a conclusion.
PWG:
MW:
उत्पला-वती [p= 181] : f. N. of a river MBh. [L=31710]; f. of an अप्सरस्. [L=31711]
PW:
उत्पलावती [L=18409] [p= 1225-1] f. N.pr. eines Flusses Mbh.6,342. = ताम्रपर्णी Gal.
Factual error. Śaṅkara’s work is called उपदेशसाहस्री
PWG:
MW:
उप-देश-साहस्री [p= 199] : f. N. of certain works. [L=34706]
साहस्र [p= 1212] : mf(ई, or आ)n. (fr. सहस्र) relating or belonging to a thousand, consisting of or bought with or paid for a thousand, thousand fold, exceedingly numerous, infinite VS. &c [L=243642]
VCP :
Acceptable alternative forms.
PWG:
MW:
उष्मन् [p= 220] : m. heat, ardour, steam Mn. MBh. Suṡr. &c (in many cases, where the initial उ is combined with a preceding अ, not to be distinguished from ऊष्मन् q.v.) [L=37852]
ऊष्मन् [p= 223] : m. ( √उष् cf. उष्मन्), heat, glow, ardour, hot vapour, steam, vapour AV. vi, 18, 3 VS. ṠBr. KātyṠr. BhP. (also figuratively said of passion or of money &c ) [L=38352]
Factual error.
Already corrected in PWG in „Nachträge“ (vol. 7):
एकव्यवहारिक [L=119212] [p= 7-1722] (Nachträge), wohl °व्यावहारिक zu verbessern.
OCR error.
Factual error. The change should include ऐन्द्रावरुण and ऐन्द्रावारुण (both forms incorrectly mentioned in PWG with first ă instead of ā)
PWG: ऐन्द्रवरुण [L=69068] [p= 5-1223] adj. zu Indra und Varuṇa in Beziehung stehend Ait. [Page05.1224] Br. 6, 14. 25. °वारुण Pańḱav. Br. 8, 8, 6.
MW:
ऐन्द्रावरुण [p= 234] : mfn. relating to इन्द्र and वरुण AitBr. Vait. [L=40471]
ऐन्द्रावारुण [p= 234] : mfn. = ऐन्द्रावरुण above TāṇḍyaBr. [L=40473]
The Tandya Brahmana 8.8.6 has ऐन्द्रावारुण :
Factual error.
PWG:
MW:
कन्य-कुमारी [p= 249] : f. N. of दुर्गा TĀr. [L=43075]
कन्या-कुमारी [p= 249] : f. = कन्य-कु°
कुमारि [p= 292] : (shortened for °री q.v. ; cf. Pāṇ. 6-3, 63)
कुमारी a [p= 292] : f. a young girl, one from ten to twelve years old, maiden, daughter AV. AitBr. &c [L=52291]
False positive. केशर and केसर are alternative forms of the same word (cf. case 7).
PWG:
MW:
केशर [p= 310] : &c » केसर. [L=56028]
Insufficient elements to reach a conclusion. PWG:
PW: The mentioned work is in a manuscript edition:
MW:
कुडूहुञ्ची [p= 289] : f. (a Mahratti N. of) Solanum trilobatum Npr. [L=51760]
(Npr. = निघण्टुप्रकाश)
OCR error
Factual error.
Although the form क्रोलायन is found in a manuscript (MS., v. MW), it is ungrammatical. The secondary suffix आयन, forming patronymics, requires vṛddhi-strengthening of the first syllable (cf. Whitney, Sanskrit Grammar, 1219).
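The rule can be illustrated with a small sketch that applies vṛddhi to the first vowel of a word written in SLP1. The function name `vrddhi_first_syllable` and the mapping table are my own illustration, and the sketch ignores the special cases treated by the grammars (for example, initial clusters with y or v).

```python
# SLP1 vowels and their vrddhi grades (a simplified table, for illustration only).
VOWELS = set("aAiIuUfFxXeEoO")
VRDDHI = {
    "a": "A", "A": "A",
    "i": "E", "I": "E", "e": "E", "E": "E",
    "u": "O", "U": "O", "o": "O", "O": "O",
    "f": "Ar", "F": "Ar",
    "x": "Al", "X": "Al",
}


def vrddhi_first_syllable(word):
    """Return the SLP1 word with its first vowel replaced by the vrddhi grade."""
    for idx, ch in enumerate(word):
        if ch in VOWELS:
            return word[:idx] + VRDDHI[ch] + word[idx + 1:]
    return word  # no vowel found: leave the input unchanged
```

Applied to `krolAyana` (क्रोलायन) this gives `krOlAyana` (क्रौलायन), the form registered by MW.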
PWG:
PW:
MW:
क्रौलायन a [p= 323] : m. patr. fr. क्रोल (for °ड) Pravar. (क्रोल्° MS.) [L=58522]
OCR error.
False positive. Different words.
PWG:
चतुस्तन [L=24654] [p= 2-0935] (चतुर् + स्तन) adj. f. vierzitzig: गौः Çat. Br. 6, 5, 2, 18.
स्तन 1) die weibliche Brust, Zitze (bei Menschen und Thieren)
MW:
चतुः-स्थान [p= 384] : » चतु-स्°. [L=71129]
चतु-स्थान [p= 383] : mfn. having a fourfold basis Nār. i, 8. [L=71082]
स्तन [p= 1257] : m. (…) the female breast (either human or animal) , teat, dug, udder RV. &c [L=254308]
@Shalu411 have you ever heard of a Marathi word such as the one in case 17?
Regarding case 19 (क्रोलायन → क्रौलायन), now I think it is better to preserve the reading क्रोलायन. It is an attested form, mentioned as such by MW and PW. I think it is important to preserve as much as possible the correspondence between the digital and the printed version, which should be treated as a historical document, with its imperfections and all.
Acceptable alternative form.
MW:
उषण [p= 220] : n. black pepper [L=37701]; n. the root of Piper Longum [L=37702]
ऊषण [p= 223] : n. black pepper Suṡr. [L=38346]
Alternative form, mentioned as such by PWG: (I think each of these forms should be accessible as headwords)
MW:
चिलिचिम [p= 399] : m. a kind of fish Car. i, 25 Suṡr. i, 20, 3 and 8. [L=74362]
चिलिची°मि [p= 399] : m. id. L. Sch. » also चिलमीलिका. [L=74364]
False positive. A form reported as incorrect by PWG itself:
BEN:
jAmbunadamaya [L=5388] [p= 0330-b] and jAmbUnadamaya [Page0331-a+ 39]; jâmbu¤10nada + maya, adj., f. yî, Golden, Pańch. 175, 8.
24 - not actually a false positive. I would make another list of fehlerhaft words - words that are known as bad words by the dictionary makers as well. What do you think? The search fehlerhaft für in the downloaded dictionary file would give hundreds of cases.
Not a false positive, I agree, but it should be preserved anyway. If the author decided to register such incorrect forms, it is because he probably found them in one or more texts. For a researcher or reader who encounters these forms in those texts, the information that they are wrong, and the reference to the correct form, may be very useful. As regards the list you propose, I think it would be a good idea to exclude automatically the words marked as fehlerhaft (in PWG etc.) or w.r. (in MW) from the lists generated by the o_vs_O method, and to transfer them to another list of words with a lower probability of error.
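A minimal sketch of this automatic exclusion, assuming each candidate headword can be looked up in a mapping from headword to the raw text of its entry. The names `FLAG` and `split_candidates` are hypothetical, and the pattern only covers the two markers mentioned above.

```python
import re

# Markers by which the dictionary itself flags a form as wrong:
# "fehlerhaft" (PWG etc., with or without endings) and "w.r." (MW).
FLAG = re.compile(r"\bfehlerhaft|\bw\.\s?r\.")


def split_candidates(candidates, entry_text):
    """Split candidates into (likely errors, forms already flagged as wrong).

    candidates: iterable of headwords from the generated list.
    entry_text: dict mapping a headword to the raw text of its entry.
    """
    likely, flagged = [], []
    for hw in candidates:
        if FLAG.search(entry_text.get(hw, "")):
            flagged.append(hw)  # goes to the separate, lower-probability list
        else:
            likely.append(hw)
    return likely, flagged
```

The flagged forms are kept in their own list rather than discarded, in line with the view that attested incorrect forms should stay available to readers.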
"information that they are wrong, and the reference to the correct form, may be very useful" - sure. There are not that many fehlerhaft (in PWG etc.) or w.r. (in MW) words in these lists, I hope, because the only person who could integrate this is @drdhaval2785 and he is busy now. Otherwise I agree, that's logical.
Factual error.
Words formed with the suffix -ika require vṛddhi-strengthening of the initial syllable. Cf. Whitney’s Sanskrit Grammar 1204 and 1222 j:
There is a typographical error in the MW entry: ज्येस्क्ठ-सामन् should be changed to ज्येष्ठ-सामन्
For anyone interested in finding the word ज्यैष्ठसामिक in the cited work goBila-SrAdDa-kalpa (I was unable to find it), I think it is contained in this book here.
@zaaf2 Yes, Devanagari is ok.
Re: 25. ज्येष्ठसामिक → ज्यैष्ठसामिक
What do you make of the fact that PWG has both ज्येष्ठसामन् and ज्यैष्ठसामन् and calls the latter a false reading?
ज्येष्ठसामन् [p= 3-0159] : (ज्येष्ठ + सामन्)
— 1) n. N. eines best. Sâman Gobh. 3, 2, 41. Ind. St. 3. 205. ज्येष्ठसाम्ना च देवेशं जगौ नारायणः Mbh. 13, 876. ज्येष्ठसामग M. 3, 185. ज्येष्ठसामव्रतो हरिः Mbh. 12, 13593. ज्येष्ठसामाज्यदोह Ind. st. 3, 217.
— 2) adj. der dieses Sâman singt Jâǵń. 1, 219. [L=27962]
[p= 5-1451] :
— 1) Pańḱav. Br. 21, 2, 3. [L=74575]
ज्यैष्ठसामन् [p= 3-0159] : in Çkdr. und bei Wils. falsche Form für ज्येष्ठ°. [L=27975]
A theory: the suffix 'ka' applies to sAman, yielding sAmika (the A is already vfdDi); so the compound is jyezWa + sAmika
सामिक [p= 7-0937] :
— 1) adj. von सामन् Gesang Lâṭj. 7, 9, 7.
— 2) m. Baum (!) H. ç. 172.
— Vgl. सामक. [L=108511]
@funderburkjim I guess @zaaf2 means, and I agree with him, that false readings that are old enough to be known to PWG should be left as such and marked in a separate list. They should never be corrected, but they could eventually have a link back to the good form. As the articles are not interlinked now, I am not sure whether that will ever be possible.
Not all errors (= fehlerhaft) known to Böhtlingk in PWG are equal. There are 2075 lines that contain some form of the word fehlerhaft (without endings) [1]. But B. is not sure about all of them; when unsure, he adds wohl, so it becomes wohl fehlerhaft [2].
But even fehlerhaft in an entry does not mean the whole entry is gone with the wind. It could relate to 1 out of 10 quotes and not to the headword in general [4]. The simplest cases are those where the entry is short, has no quotes, and it is easy to see that the remark relates to the headword only [3]. In these cases the headword might have been included just to cross-link.
1) 100% sure - fehlerhaft für
2) 50% sure [?] - wohl fehlerhaft, vielleicht fehlerhaft:
3) In every text, general:
4) In a separate text, concrete:
5) False positives that contain fehlerhaft but should not be excluded in any way.
@zaaf2 do you agree with such a classification, or do you think there is no need for it?
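A first pass over the extracted lines could separate the two certainty levels mechanically and leave the rest for manual review. This is only a sketch: the patterns cover just the wordings named in the list above, the label names are hypothetical, and categories 3 to 5 cannot be told apart by a pattern alone.

```python
import re

# Category 2: the editor hedges with "wohl" or "vielleicht".
UNSURE = re.compile(r"\b(wohl|vielleicht)\s+fehlerhaft")
# Category 1: plain "fehlerhaft für".
SURE = re.compile(r"\bfehlerhaft\s+für")
# Any other occurrence still needs a human decision (categories 3 to 5).
ANY = re.compile(r"fehlerhaft")


def classify(line):
    """Return 'unsure', 'sure', 'review' or 'none' for one line of dictionary text."""
    if UNSURE.search(line):  # checked first, so "wohl fehlerhaft für" counts as unsure
        return "unsure"
    if SURE.search(line):
        return "sure"
    if ANY.search(line):
        return "review"
    return "none"
```

Lines labelled `review` are the ones where the remark may concern a single quotation rather than the headword, so they should not be excluded automatically.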
https://raw.githubusercontent.com/sanskrit-lexicon/CORRECTIONS/master/PWG-2075-fehlerhaft.xml - extracted full list of fehlerhaft.
In several ways it is similar to an older list https://github.com/sanskrit-lexicon/CORRECTIONS/blob/master/PWK-zu-lesen-325.txt
@gasyoun The classification is excellent. A look at the list you generated shows that the most relevant cases (i.e., those to be included in category 1, 100% sure) include not only fehlerhaft für but also fehlerhafte Schreibart and fehlerhafte Variante (or Var.), and perhaps other such designations. If those cases are excluded from the list generated by the o_vs_O method, we will get a list with a higher probability of errors to be corrected. For example, the PWG list includes 3103 cases, which is a humanly unworkable number. Of course the most relevant are those 182 cases in which one dictionary disagrees with two or more others. Anyway, it would be desirable to narrow the total number of cases as much as possible, if the rest of the list is to make any sense.
Other suggestions are:
Verbesserungen und Nachträge (e.g. case 5). These cases could be worked directly from this errata section.
@funderburkjim I think the fact that PWG calls ज्यैष्ठसामन् a false reading only confirms that the right reading is ज्यैष्ठसामिक. The wrong form ज्यैष्ठसामन् can most probably be explained as a retro-derivation from the adjective ज्यैष्ठसामिक, and it is wrong because in this case the vṛddhi-strengthening is unjustified. This proves that the strengthening was already there, in the adjective. As seen in case 1 of issue #127, the syllable subjected to vṛddhi can be the first syllable of a compound (v. Whitney’s Sanskrit Grammar 1204).
It is possible that the vṛddhi should be conceived as already included in the ā of -सामन्, but I have a feeling that this is improbable, as the resulting form would lose its clear derivative aspect. Of course the definitive proof could only be provided by the mentioned passage.
No change. The form ज्वाला- is marked as a possible one by PWG.
The mentioned work is in a manuscript edition:
Factual error.
PWG:
त्रैयरुण [L=31311] [p= 3-0454] und त्रैयरुणि s. u. त्रय्यरुण.
The PWG section „Verbesserungen und Nachträge“ (vol. 5) has:
त्रैयरुण [L=75385] [p= 5-1481] lies त्रय्यारुण st. त्रय्यरुण
The question is how to correct this error: (a) to preserve त्रैयरुण (and त्रैयरुणि) and to change only त्रय्यरुण to त्रय्यारुण; or, considering that the word comes from आरुण, (b) to change the whole entry as follows:
त्रैयारुण [L=31311] [p= 3-0454] und त्रैयारुणि s. u. त्रय्यारुण.
I am inclined to this last solution.
Cf. MW:
त्रय्यारुण [p= 457] : m. (for त्र्य्-आरुण) N. of a prince (…)
आरुण [p= 150] : mf(ई)n. coming from or belonging to अरुण [L=26246]
Factual error, already mentioned in the PWG section „Verbesserungen und Nachträge“ (vol. 5):
दर्शनवरणीय [L=75536] [p= 5-1487] lies दर्शना° und vgl. दर्शनावरण Wilson, Sel. Works 1, 317. 310 (hier fälschlich दर्शनावसान). Sarvadarçanas. 38.
29.
देवावस → देवावास
OCR error
30.
देवावस ― दैववश
No change. Different words. (देवावस v. case 29)
31.
द्वदशमहावाक्य → द्वादशमहावाक्य
OCR error.
32.
धूलिकदम्ब → धूलीकदम्ब
OCR error.
33.
नेशिक → नैशिक
OCR error (due to the poor quality of the printed text).
34.
पत्त्रशिरा ― पत्त्रसिरा
No change. Alternative forms.
― Cf. MW:
सिरा [p= 1217] : f. (fr. √ सृ) a stream, water RV. i, 121 (cf. Naigh. i, 12 ; often written शिरा) [L=244948]
@zaaf2 1. "appears inside one of the entries" - tough one. I wonder if Jim will like the idea; I guess not, because it involves some artificial intelligence, if you ask me. The result might not be as fruitful as the work involved.
"To exclude and insert in a separate list the headwords which appears in two different entries in PWG" - makes sense, but impossible to realise without Jim's Python scripts. Easy one.
As per 27 - let Jim decide. I worry only about headwords and a few tags, because the rest is just too huge.
35.
पराक्रमकेशरिन् ― पराक्रमकेसरिन्
No change. केशरिन् and केसरिन् are alternative forms of the same word (cf. case 7).
PWG:
पराक्रमकेशरिन् (प° + के°) m. N. pr. eines Prinzen, eines Sohnes des Vikramakeçarin, Vet. in Verz. d. Oxf. H. 152,b,14.
MW:
परा-क्रम-केसरिन् [p= 589] : m. N. of a prince (son of विक्रम-केसरिन्) Vet. [L=116760]
केशरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56139]; m. (ई) a lion MBh. Suṡr. Bhartṛ. &c [L=56140]
केसरिन् [p= 311] : mfn. having a mane MBh. i, iii [L=56151] m. a lion MBh. Suṡr. Bhartṛ. &c [L=56152]
36.
पुरुषकेशरिन् ― पुरुषकेसरिन्
No change. केशरिन् and केसरिन् are alternative forms of the same word (cf. cases 7 and 35).
37.
पुरुषघ्न ― पूरुषघ्न
No change. Alternative form (metri causā)
PWG:
पुरुषघ्न [L=46166] [p= 4-0796] (पु° + घ्न) adj. Leute treffend, - tödtend Ṛv. 1, 114, 10. स्त्री पुरुषघ्नी eine Frau, die ihren Mann getödtet hat, Jâǵń. 2, 278.
पूरुष [L=46685] [p= 4-0837] s. पुरुष.
MW:
पूरुष-घ्न [p= 643] : mfn. slaying men RV. [L=127810]
पूरुष [p= 643] : m. (mc.) = पुरुष RV. &c &c [L=127809]
27 - not just "No change". The ? should remain with the question mark. The ? = German wohl. Agree?
@gasyoun I suppose you are referring to case 26 (ज्वलारासभकामय ― ज्वालारासभकामय). What I mean is that the PWG author considered the possibility of the word being written as composed with ज्वाला. He is probably questioning the form he found in the manuscript (ज्वला-) or perhaps the manuscript didn’t give a clear reading. I think in this case a change is not justified.
38.
पूर्वभाद्रपदा ― पूर्वाभाद्रपदा
No change. Alternative form (v.l. = varia lectio)
PWG:
पूर्वभाद्रपदा [L=46835] [p= 4-0848] (पूर्व + भा°) f. N. des 25ten Nakshatra H. 115, v. l. °योगे Mbh. 13, 3282. Vp. 226, N. 21. °पद Colebr. Misc. Ess. Ii, 343.
MW:
पूर्वा-भाद्रपदा [p= 644] : f. the 25th नक्षत्र MBh. (v.l. पूर्व-भ्°). [L=128199]
पूर्व-भाद्रपद [p= 644] : m. (and f(आ). pl.) the 25th नक्षत्र, the former of the two called भाद्रपदा (containing two stars) MBh. VP. Col. [L=128043]
39.
बाध्योगायन ― बाध्यौगायन
No change. In the PWG section Verbesserungen und Nachträge zum ganzen Werke (vol. 7) the form बाध्यौगायन was considered as a possible correction to the reading बाध्योगायन.
PWG:
बाध्योगायन [L=52413] [p= 5-0068] m. patron. von बाध्योग gaṇa हरितादि zu P. 4, 1, 100.
बाध्योगायन [L=121273] [p= 7-1781] nach gaṇa अनुशतिकादि zu P. 7, 3, 20 könnte man बाध्यौ° erwarten.
40.
बावाशस्त्रिन् → बावाशास्त्रिन्
Factual error.
शास्त्रिन् gives the only sense adapted to the character of an author. बावा-शास्त्रिन् is the name of someone called also बावा-देव, so that the second part of the name is almost an epithet and has to be adapted to the character of the man, here a learned man, not a warrior (शस्त्रिन् 2, having weapons). शस्त्रिन् in the sense of a reciter is a rare one.
MW:
बावा-शास्त्रिन् [p= 729] : and बावा-देव m. N. of authors Cat. [L=144827.01]
शास्त्रिन् [p= 1069] : mfn. or m. versed in the शास्त्रs, learned (cf. सतत-श्°) Cat. [L=216457]; m. a teacher of sacred books or science, a learned man W. [L=216458]
शस्त्रिन् 1 [p= 1044] : mfn. (for 2. » [p= 1061,2]) reciting, a reciter ĀpṠr. Sch. [L=210890]
शस्त्रिन् 2 [p= 1061] : mfn. having weapons, bearing arms, armed with a sword MBh. Hariv. Kām. &c [L=214641]
39 - a possible correction should be marked as [?] and not just ignored. @funderburkjim what do you think?
41.
भ्रष्ट्रज → भ्राष्ट्रज
OCR error.
42.
मसीधनी ― मसीधानी
No change. Attested incorrect form (मसी, incorrectly for मषी, MW).
PWG:
SHS:
मसीधानी [L=30820] [p= 555-b] f. (-नी) An inkstand. E. मसी ink, धानी what holds.
MW:
मसि [p= 794] : and मसी, incorrectly for मषि and मषी q.v. (मसी- √भू, to become black Ṡiṡ. xx, 63 ; cf. मषी-भावुक) [L=159098]
मषी-धनी [p= 793] : f. an ink-stand L. [L=159047]
मषि-धान [p= 793] : n. an ink-stand L. [L=159036]
43.
महारोद्र → महारौद्र
OCR error.
44.
माधवश्रम → माधवाश्रम
OCR error.
45.
रक्तकेशर ― रक्तकेसर
No change.
केशर and केसर are alternative forms of the same word (cf. cases 7 and 16).
MW:
केशर [p= 310] : &c » केसर. [L=56028]
46.
लक्षणवादरहस्य ― लक्षणावादरहस्य
Insufficient elements to reach a conclusion.
PWG:
PW:
ACC:
MW:
लक्षण [p= 892] : mfn. indicating, expressing indirectly Vedântas. [L=180381] (…); n. (ifc. f(आ).) a mark, sign, symbol, token, characteristic, attribute, quality (…); n. a lucky mark, favourable sign (…); n. accurate description, definition, illustration (…)
लक्षणा b [p= 892] : f. aiming at, aim, object, view Hariv. [L=180434]; f. indication, elliptical expression, use of a word for another word with a cognate meaning (as of " head " for " intellect "), indirect or figurative sense of a word (…)
This issue is about an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html, generated by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG.
OCR error.