sanskrit-lexicon / CORRECTIONS

Correction history for Cologne Sanskrit Lexicon

Misc. corrections #298

Closed funderburkjim closed 8 years ago

funderburkjim commented 8 years ago

Corrections and/or issues noticed recently.

mw72:aShezas,7238:aSezas:t:
ap:uddac,8980:udfc:t: dda/df
ap90:vizvadyac,26668:vizvadryac:p: The ending ligature appears to be drya
pw:lakac,95266:lakaca:t: no virAma on 'c'
mw:upaDOkita,34614:upaQOkita:t: D/Q
mw:upaBrita,35263:upaBfta:t: ri/f
mw:upaBrita,35264:upaBfta:t: ri/f
mw:upayazwri,:upayazwf:p: ri/f. Text missing dot under 'r'. cf mw72, ap, etc.
mw:arcisArcizmat,15734:arcizmat:t: 
mw:arcisArcizmat,15735:arcizmat:t: 
mw:arcisArcizmatI,15736:arcizmatI:t: 
mw:ambuDakAminI,14508.1:ambu--DakAminI,ambu-Da-kAminI:t: refactor key2
mw:amBaHsyAmAka,14574:amBaHSyAmAka:t: s/S
mw:bEjavApayana,146456:bEjavApAyana:t: pay/pAy
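
The records above share a colon-separated layout: dictionary code, current headword plus record number, proposed form, a one-letter type code, and a free-text comment. A minimal sketch of a parser for that layout follows. The field names and the `parse_correction` helper are illustrative assumptions, not part of the repository's tooling, and the meaning of the type codes (`t`, `p`, `n`) is not defined in this thread.

```python
from typing import NamedTuple, Optional


class Correction(NamedTuple):
    dictionary: str        # e.g. "mw", "ap90"
    old: str               # form as currently digitized
    lnum: Optional[str]    # record number; None when the line leaves it empty
    new: str               # proposed form (may itself contain a comma)
    kind: str              # one-letter type code as written in the thread
    comment: str


def parse_correction(line: str) -> Correction:
    """Parse one 'dict:old,L:new:kind: comment' line (hypothetical helper)."""
    # Split at most 4 times so colons inside the comment are preserved.
    dictionary, old_and_l, new, kind, comment = line.split(":", 4)
    # Only the first comma separates the headword from the record number.
    old, _, lnum = old_and_l.partition(",")
    return Correction(dictionary, old, lnum or None, new, kind, comment.strip())


if __name__ == "__main__":
    print(parse_correction("ap:uddac,8980:udfc:t: dda/df"))
```

Lines that carry an extra field, such as the `H1A>:H1C>` markup changes later in this thread, do not fit this five-field layout and would need separate handling.
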
funderburkjim commented 8 years ago
; Notes on some dual/plural forms in MW:
; aDyayanatapasI:  n.du. of aDyayanatapas
funderburkjim commented 8 years ago

See below for resolution

; notes on likely HxA that need to be HxB in MW
;1A,alpena,16828 has different key1 than previous:alpa
;2A,utkzepO,30980 has different key1 than previous:utkzepa
;1A,upapakzO,35046 has different key1 than previous:upapakza
;1A,kAkutsTO,47217 has different key1 than previous:kAkutsTa
;1A,kuSIlavO,53471 has different key1 than previous:kuSIlava
;3A,kfcCrAtikfcCrO,54760 has different key1 than previous:kfcCrAtikfcCra
;2A,kruYcO,58224 has different key1 than previous:kruYca
;2A,bEjavApayana,146456 has different key1 than previous:bEjavApAyana
;1A,samyak,237309 has different key1 than previous:samyaYc
;1A,samyak,237310 has different key1 than previous:samyaYc
;1A,samyak,237311 has different key1 than previous:samyaYc
;1A,samyak,237312 has different key1 than previous:samyaYc
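
A check that produces notes like those above might be sketched as follows. It assumes each record is a tuple `(hcode, key1, L)` and that "previous" means the most recent record whose H code has no letter suffix; both the tuple layout and that reading are assumptions, and `flag_suspect_a_records` is a hypothetical name. The record numbers of the parent rows in the usage example are made up for illustration.

```python
def flag_suspect_a_records(records):
    """Return a note for each HxA record whose key1 differs from its parent's key1.

    records: iterable of (hcode, key1, lnum) tuples, e.g. ("H1A", "alpena", "16828").
    """
    flagged = []
    parent_key1 = None
    for hcode, key1, lnum in records:
        if hcode[-1].isdigit():
            # A plain H1/H2/H3/H4 record starts a new group.
            parent_key1 = key1
        elif hcode.endswith("A") and parent_key1 is not None and key1 != parent_key1:
            # An 'A' record is expected to repeat the parent's headword.
            flagged.append(
                f";{hcode[1:]},{key1},{lnum} has different key1 than previous:{parent_key1}"
            )
    return flagged


if __name__ == "__main__":
    sample = [
        ("H1", "alpa", "0"),        # hypothetical parent record number
        ("H1A", "alpena", "16828"),
    ]
    for note in flag_suspect_a_records(sample):
        print(note)
```
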
funderburkjim commented 8 years ago
mw:parAgagin,116779:parAgin:t:  key2 should be parA<sr/>gin
mw:parAgagata,116781:parAgata:t: key2  parA<sr/>gata
mw:parAgagantf,116782:parAgantf:t: key2 parA<sr/>gantf
mw:parAgagama,116783:parAgama:t: key2 parA<sr/>gama
; these are probably the tip of an iceberg.  The last three in particular are cases where
; (a) there is a prefixed verb, classed as an H1 or an H2
; (b) there is a block of H3 verbal derivatives WHICH MW CODES IN A VERY SIMILAR WAY TO
;       a block of compounds
; (c) the digitization incorrectly infers the full form of the compound
;      e.g. from parAgam and gantf the wrong inference is parAga-gantf

image

gasyoun commented 8 years ago

the tip of an iceberg

Any chance to batch weed them out?

funderburkjim commented 8 years ago

I'm attempting that now.

drdhaval2785 commented 8 years ago

Very interesting and long-awaited corrections of compounds. I am sure that refinement in this area will need some grammatical algorithm. I will chip in at that point with auto-suggestion algorithms.

Best wishes

funderburkjim commented 8 years ago

An identification of errors like those under prefixed root parigam has now been done. Fortunately, the number of similar errors is much smaller than anticipated. Here they are:

;1  117828  parinirvA   parinirvA   VERB:K ***
mw:parivARa,117829:parinirvARa:t: key1 error
mw:parivApayitavya,117833:parinirvApayitavya:t: key1 error
mw:parivAyin,117834:parinirvAyin:t: key1 error
;1  118977  parisaMcakz parisaMcakz VERB:K 
mw:paricakzya,118978:parisaMcakzya:t: key1 error
;1  118983  parisaMtap  parisaMtap  VERB:K
mw:paritapta,118984:parisaMtapta:t: key1 error.   ALSO remove HOm; also remove hom on 117633
mw:pratyupaBupaBoga,134403:pratyupaBoga:t: key1 error
funderburkjim commented 8 years ago

The working documents for this preverb analysis are here.

funderburkjim commented 8 years ago

While looking for cases like parigam, I also noticed a couple of other kinds of cases that seemed like markup errors. These were of only two kinds:

image

The corrections of this type were noticed in a random manner, and are definitely incomplete -- there are numerous other similar corrections that should be made. It just requires an effort to examine the cases.

However, this distinction (between a '-' and a '~') DOES have significance in MW, as it indicates a different relation between the parent and child. My current working principle is that:

image

funderburkjim commented 8 years ago

The main purpose of the MWderivation repository mentioned above is to use the information of the H-hierarchy and the key2 markup to 'explain' all the headwords of MW. In its current incarnation, all but about 10,000 of the 200,000+ headwords are explained. The application of more Sanskrit knowledge would likely reduce this number, as well as improve the derivations of some words whose current explanations might be suspect.

The analysis2.txt file shows the current state of this MWderivation work.

funderburkjim commented 8 years ago

vcp:anukrISa,1804:anukroSa:t: I/o

gasyoun commented 8 years ago

10,000 of the 200,000+ headwords are explained. The application of more Sanskrit knowledge

Hint, hint - Dhaval is missing here.

funderburkjim commented 8 years ago

In the course of #296 investigation, noticed a possible mis-spelling in other dictionaries:

; ziNga:ziNga:BUR,PW,SCH,SHS;ziNgaH:SKD
 PW, SCH, SHS, SKD should prob. all be ziqga (libertine)

Since N and q in Devanagari are quite similar (differing only by an extra 'dot' in N), it MAY be that this word is mistyped in PW, SCH,SHS,SKD, as it was in BUR. Will examine later.
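
One way to look for other candidates of the same confusion is to generate, for each headword, the spellings that differ by a single N/q swap and see whether they occur elsewhere. The sketch below assumes the headwords are plain strings in the same transliteration as the keys in this thread; `nq_variants` and `suspicious_pairs` are hypothetical names, and the headword sets in the usage example are illustrative only.

```python
def nq_variants(word):
    """Return the spellings obtained by swapping exactly one N for q, or one q for N."""
    swap = {"N": "q", "q": "N"}
    return [
        word[:i] + swap[ch] + word[i + 1:]
        for i, ch in enumerate(word)
        if ch in swap
    ]


def suspicious_pairs(headwords, reference):
    """Pair each headword with any N/q variant of it found in the reference set."""
    pairs = []
    for word in sorted(headwords):
        for variant in nq_variants(word):
            if variant in reference:
                pairs.append((word, variant))
    return pairs


if __name__ == "__main__":
    # Illustrative data: one dictionary has ziNga, another has ziqga.
    print(suspicious_pairs({"ziNga", "jaya"}, {"ziqga", "jaya"}))
```

A hit from this kind of check is only a prompt to look at the scan; words such as piNga are legitimately spelled with N.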

funderburkjim commented 8 years ago

June 6, 2016. These corrected.

In MW under piNga, there are errors in gender for the feminine piNgA, for L=123567 - 123573: all of these should be piNgA: f.

(H2B) pi/NgA : f. a bow-string RV.  viii, 58, 9 (Sāy.  ; cf. piNgala-jya) [L=123566]
(H2B) piNga : m. a kind of yellow pigment (cf. go-rocanA) [L=123567]
m. the stalk of Ferula Asa Foetida L.  [L=123568]
m. turmeric, Indian saffron L.  [L=123569]
m. bamboo manna W.  [L=123570]
m. N. of a woman MBh.  [L=123571]
m. of durgA W.  [L=123572]
m. a tubular vessel of the human body which according to the yoga system is the channel of respiration and circulation for one side ib. [L=123573]
funderburkjim commented 8 years ago

Here is a question about German (old and new), arising from a user-submitted correction for Grassmann. Under headword paYcadaSa, the user said that funfzehn needs to be changed to fünfzehn, and Google Translate agrees that the spelling has a u-umlaut. However, the Grassmann print shows several instances without the u-umlaut under the words paYcadaSa, paYcadaSan, and paYcASat.

Incidentally, in the totality of our digitization, these are the only places where 'funf' appears. By contrast, fünf appears 32 times, though never in the form fünfzehn.
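
Counts like these can be reproduced with a short script. The following is a sketch under stated assumptions: the digitized text is available as a Unicode string, and matching is by substring, so 'funfzehn' counts as an occurrence of 'funf'. The `count_forms` helper is hypothetical.

```python
import re
import unicodedata
from collections import Counter


def count_forms(text, forms=("funf", "fünf")):
    """Count case-insensitive substring occurrences of each form in text."""
    # Normalize to NFC so that 'u' + combining diaeresis is treated as 'ü'
    # and is not miscounted as a plain 'u'.
    text = unicodedata.normalize("NFC", text)
    counts = Counter()
    for form in forms:
        pattern = re.escape(unicodedata.normalize("NFC", form))
        counts[form] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


if __name__ == "__main__":
    print(count_forms("funfzehn; fünf, Fünfzahl"))
```
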

Based on this, I suspect the 'funf' form should be considered a print error, and be changed to fünf.

The question is: should this be treated as a printing error, or is it some variant old German spelling that should remain unchanged in our digitization?

image

gasyoun commented 8 years ago

Based on this, I suspect the 'funf' form should be considered a print error, and be changed to fünf .

Agree.

The question is - is this to be treated as a printing error , or is it some variant old German spelling that should remain unchanged in our digitization?

Not aware of such (there are too many, actually, to note them), but this is not one of them for sure. Add u-umlaut.

drdhaval2785 commented 8 years ago

Hint, hint - Dhaval is missing here.

I need to understand what is left over and what the methodology is before I can help @funderburkjim. The readme needs to explain this in short, brief points.

funderburkjim commented 8 years ago

@drdhaval2785 I'll get to that brief readme in the not too distant future.

funderburkjim commented 8 years ago

@gasyoun Thanks for input re German - will add the umlauts.

funderburkjim commented 8 years ago

@drdhaval2785 A user submitted a correction to PWG, under headword anyatra, section 7) c); col 2, top of this page image

His point was that c) should say 'abl.' rather than 'instr.'

This seems right to me, as the example shows 'manuzAt', which is abl.

However, it is confusing that under (d) another ablative example appears.

Do you agree that c) should be 'abl', and is thus a print error ?

funderburkjim commented 8 years ago

Found a situation under 'jaya' similar to that under piNga above - the entries after the first feminine form jayA were incorrectly noted as jaya, m. Corrected.

drdhaval2785 commented 8 years ago

It doesn't seem to be an error to me. The word to look for is 'anyatra' and not 'mAnuza'. anyatra doesn't take case suffixes. The case has to be inferred from context. The context seems suitable to me.

gasyoun commented 8 years ago

anyatra doesn't take case suffixes. The case has to be inferred from context.

Why without?

funderburkjim commented 8 years ago

@drdhaval2785 Re anyatra - Agree that anyatra is indeclinable.

I was thinking that the 'als instr.' meant 'with (another word) in instrumental', and that the 'als abl.' meant 'with (another word) in ablative', and so forth.

However, I think you are saying that 'als instr.' means 'when this word (anyatra) is used in an instrumental sense', etc.

The 'als instr.' example has:

At least based on this indirect translation, the instrumental case phrase is 'by Nobody' which I take to refer to 'tasya na anyatra' in the Sanskrit.

So, this confirms @drdhaval2785 's interpretation.

Another confirmation is in the description of AP, where, among other uses, the authors say that '[anyatra is often used] with the force of the nom. case'.

This is the only indeclinable word I recall which has this property of being used as if it were in one of various cases. I wonder if there are other such words (is there an anyatrAdi list?) or if it is somewhat unique.

funderburkjim commented 8 years ago

Examination of ziNga as mentioned above. Only Sch shows ziNga. The others are typos.

pw:ziNga,116428:ziqga:t:  Author states that KiNga. Kiqga are 'vgl.' (variant reading?)
sch:ziNga,25483:ziNga:n:  Author mentions that pw has ziqga
shs:ziNga,41465:ziqga:t:  confirm by alph. order
skd:ziNgaH,37026:ziqgaH:t:  confirm by alph. order
funderburkjim commented 8 years ago

Resolution of the cases above, alpena through samyak:

mw:alpena,16828:H1A>:H1C>:t: markup change
mw:utkzepO,30980:H2A>:H2B>:t: markup change (dual)
mw:upapakzO,35046:H1A>:H1B>:t: markup change (dual)
mw:kAkutsTO,47217:H1A>:H1B>:t: markup change (dual)
mw:kuSIlavO,53471:H1A>:H1B>:t: markup change (dual)
mw:kfcCrAtikfcCrO,54760:H3A>:H3B>:t: markup change (dual)
mw:kruYcO,58224:H2A>:H2B>:t: markup change (dual)
mw:bEjavApayana,146456:bEjavApAyana:t: pay/pAy
mw:samyak,237309:H1A>:H1C>:t: markup change
mw:samyak,237310:H1A>:H1C>:t: markup change
mw:samyak,237311:H1A>:H1C>:t: markup change
mw:samyak,237312:H1A>:H1C>:t: markup change
funderburkjim commented 8 years ago

These corrections have now been installed:

       #t  #p  #n
ap90    1   0   0
ap      1   0   0
mw     30   1   0
mw72    1   0   0
pw      2   0   0
sch     0   0   1
shs     1   0   0
skd     1   0   0
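
A summary table of this shape can be derived from the correction lines themselves. The sketch below is an assumption about how that might be done, not the script actually used: it takes the dictionary code from the first field and looks for the type code in the fourth or fifth colon-separated field (the fifth covers the `H1A>:H1C>` markup-change lines). `tally` and `format_table` are hypothetical names.

```python
from collections import Counter

KINDS = ("t", "p", "n")


def tally(lines):
    """Count corrections per (dictionary, kind); also return lines with no type code."""
    counts = Counter()
    malformed = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blanks and ';' comment lines
        fields = line.split(":")
        # The type code is field 4 normally, field 5 for markup-change lines.
        kind = next((f for f in fields[3:5] if f in KINDS), None)
        if kind is None:
            malformed.append(line)
        else:
            counts[(fields[0], kind)] += 1
    return counts, malformed


def format_table(counts):
    """Render the counts in the same column layout as the table above."""
    rows = [" " * 5 + "".join(f"{'#' + k:>4}" for k in KINDS)]
    for d in sorted({d for d, _ in counts}):
        rows.append(f"{d:<5}" + "".join(f"{counts[(d, k)]:>4}" for k in KINDS))
    return rows


if __name__ == "__main__":
    sample = [
        "ap:uddac,8980:udfc:t: dda/df",
        "mw:alpena,16828:H1A>:H1C>:t: markup change",
        "sch:ziNga,25483:ziNga:n:  Author mentions that pw has ziqga",
    ]
    counts, malformed = tally(sample)
    print("\n".join(format_table(counts)))
```

Returning the lines without a recognizable type code separately makes it easy to spot records where a colon was dropped.
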
gasyoun commented 8 years ago

Ger. als = Eng. as (= in sense).