funderburkjim closed this issue 7 years ago
How many corrections were approved, and how quickly is the proofreading moving along? As for point 1, the topic is (partly) discussed at https://groups.google.com/forum/#!topic/bvparishat/IaCEYDmLmbI As for point 2, the topic is discussed at https://groups.google.com/forum/#!topic/bvparishat/Eyz0lSNDk-s What I would go for, and what I did when working on an index of all Sanskrit words from all Cologne dictionaries, is to remove all the visargas and anusvaras at the end, but "remember" where I removed them, for the sake of indexing and searching. Same with the double consonants before "r".
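A minimal sketch of that strip-and-remember idea, assuming headwords in SLP1 transliteration (where 'H' is the visarga and 'M' the anusvara). The name `strip_final` and the sample words are hypothetical and not taken from any Cologne script; the double consonants before "r" are not handled here.

```python
def strip_final(word):
    """Return (key, removed): the word without a final visarga/anusvara,
    plus the removed character ('' if nothing was removed)."""
    if word and word[-1] in ("H", "M"):
        return word[:-1], word[-1]
    return word, ""


# Build an index keyed by the stripped form, keeping the original spelling
# and the removed ending so the full form can always be recovered.
index = {}
for w in ["agniH", "svayaM", "dfzwa"]:
    key, removed = strip_final(w)
    index.setdefault(key, []).append((w, removed))
```

Because the removed character is stored next to each key, a search can match on the stripped form while the display still shows the spelling found in the dictionary.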
Namaste. Issue 1: What I found in Apte was this: where a word exists in all 3 genders, it is left as a prAtipadika (substantive = base word). E.g. दृष्ट (dRSTa): nothing is mentioned beside the word except an "a" = adjective. We understand it as existing in all 3 genders. Whereas दृष्टिः is given with visarga and marked "f" because it is exclusively feminine. (Images attached of what he says about that issue in Preface > Directions.) Maybe that will help here as well. What I find in Vachaspatyam is that it always gives the base word, unlike SKD, which always has the first singular form. So there are different standards. For me Apte seems best, but that presupposes some Sanskrit knowledge. When looking into a real, touchable book, it's no issue at all, because you never bother whether the word has a visarga or not, as long as you can find it in its alphabetical order. It's only in digital editions that it becomes an issue. I think maybe giving an option between both is better? Both ways searchable, as base word and with first form too? Possible?
Namaste. An interesting observation in this line: http://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/2013/web/webtc1/index.php स्वय [L=41138] [p= 5-474] - स्वयं, [म्] व्य, आत्मना । There is no such word as "svaya"; only "svayaM" exists in the language. Removing "M" and "ः" at the end of words can bring in disasters like introducing non-existent, ungrammatical words (ghost words) into the vocabulary of a language. So removing "M" and "ः" and giving the bases is an agreeable practice only for nominal bases; it cannot be extended to indeclinables. So please do not standardize this method. Apte seems much more meaningful in this regard. Thank you.
This is an interesting issue. Indeclinables: there are not so many of them, so can we have a list of those which we should not touch? It's a good point. I agree, non-words (apadam) are something we would not want to have. Jim, can we make a RegEx protection for words which contain avyaya markup? I do not see a simple solution: http://research.ijcaonline.org/volume38/number6/pxc3876825.pdf
Namaste. One can surely have the list; there is an avyaya kosha. But a simpler way is to get them out of the already digitized dictionaries themselves on the basis of some code words. Beside every avyaya word, SKD gives "व्य", e.g. [L=41138] [p= 5-474] स्वयं, [म्] व्य, आत्मना ।...... VCP gives "अव्य", e.g. [L=47570] [p= 5381] स्वयम्¦ अव्य० सु + अय--अमु । आत्मनेत्यर्थे अमरः । Apte gives "ind.", e.g. स्वयम् ind. 1 Oneself, in one's own person... So if there is a way to pull out these words, fine. Otherwise we can have a list prepared from the avyaya kosha. It's an exhaustible list, dependable.
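A rough sketch of pulling out indeclinables by those code words. It assumes each entry is available as plain text beginning with the headword, as in the three examples above (without the [L=...] [p=...] prefix); the actual Cologne data files may be laid out differently. The names `MARKERS`, `is_indeclinable` and `headword_of` are hypothetical.

```python
# Marker printed beside indeclinables in each dictionary, as quoted above.
MARKERS = {"SKD": "व्य", "VCP": "अव्य", "Apte": "ind."}


def is_indeclinable(entry_text, dictionary):
    """True if one of the first tokens after the headword is the avyaya marker."""
    marker = MARKERS[dictionary]
    tokens = entry_text.split()
    for tok in tokens[1:4]:
        # Drop a trailing comma or the abbreviation sign '०' before comparing,
        # so that 'अव्य०' is never mistaken for SKD's shorter 'व्य'.
        if tok.rstrip(",०") == marker:
            return True
    return False


def headword_of(entry_text):
    """First token of the entry, without the trailing ',' or '¦' separator."""
    return entry_text.split()[0].rstrip(",¦")


samples = [
    ("SKD", "स्वयं, [म्] व्य, आत्मना ।"),
    ("VCP", "स्वयम्¦ अव्य० सु + अय--अमु । आत्मनेत्यर्थे अमरः ।"),
    ("Apte", "स्वयम् ind. 1 Oneself, in one's own person"),
]
protected = [headword_of(text) for name, text in samples if is_indeclinable(text, name)]
```

Tokens are compared whole rather than with a regular-expression word boundary, because Devanagari combining signs such as the virama do not count as word characters in Python's `re`, which makes `\b` unreliable for this script.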
By contrast, in SKD, in one step of the headword 'key' generation, the following simplification was done:
Remove ending 'm','M' and 'H' (for consistency with MW conventions)
In hindsight, this simplification may have been inappropriate.
Question: Should I retrofit SKD, avoiding this simplification?
Then 'agniH' and 'svayam' would be headword keys, as in AP90. (Note, the spelling in SKD is actually 'svayaM'. I think the conversion of final 'M' to 'm' remains appropriate.)
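For illustration only, the two alternatives could be sketched as follows, assuming SLP1 headwords. The function `headword_key` and its `strip_endings` flag are hypothetical and are not the actual SKD key-generation code.

```python
def headword_key(headword, strip_endings=False):
    """Derive a headword key from an SLP1 headword.

    strip_endings=True mimics the simplification described above
    (drop a final 'm', 'M' or 'H'); the default keeps the ending and
    only converts a final 'M' to 'm'.
    """
    if strip_endings:
        if headword.endswith(("m", "M", "H")):
            return headword[:-1]
        return headword
    if headword.endswith("M"):
        return headword[:-1] + "m"
    return headword


for hw in ["agniH", "svayaM"]:
    print(hw, headword_key(hw, strip_endings=True), headword_key(hw))
```

With the default setting, 'agniH' stays 'agniH' and 'svayaM' becomes 'svayam', so the ghost key 'svaya' is not produced.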
1) Words with "ind" markup would be a nice starting point. Could you please show a .txt file of them? 2) Retrofit might add even more issues, Shalu? I would love to see all kinds of lists of headwords for further decisions.
hwnorm1 is where this should be handled. From my experience it is too difficult to do this change generically. Better to normalize in a shadow file like hwnorm1c.
The corrections submitted for SKD are now processed, and it won't take so long to process further corrections. Two details arose in considering these corrections, and this 'sanskrit-lexicon/Cologne/Issues' list seems a reasonable place to mention these details.