Open funderburkjim opened 4 years ago
One is a print change -- poor quality print. Noted in krm_printchange file of csl-corrections.
I wonder how many dozens of errors caused by poor-quality print go unnoticed. Even a human reading the dictionary cannot catch them; only comparison of similar words can.
only similar word comparison can.
Agreed. Many subtle errors are uncovered by comparing contextually similar words from different sources.
Only the 5th is classed as a print change.
; Case 001: L=271, k1=ANaH, #changes=2, #extra_withs=1
; ANaH krandassAtatye -> ANaH krandaH sAtatye
; (separate premarker from root, and root from sense)
old: <L>271<pc>0277<k1>ANaH<k2>ANaH
new: <L>271<pc>0277<k1>kranda<k2>kranda
;
old: (271) {@<s>“ANaH krandassAtatye”</s>@}¦ (X<s>-curAdiH</s>-1728. <s>aka</s>. <s>sew</s>. <s>uBa</s>.)
new: (271) {@<s>“ANaH krandaH sAtatye”</s>@}¦ (X<s>-curAdiH</s>-1728. <s>aka</s>. <s>sew</s>. <s>uBa</s>.)
; -----------------------------------------------
; Case 002: L=274, k1=qukrIY, #changes=2
; qukrIY -> qu krIY
; separate premarker from root
old: <L>274<pc>0283<k1>qukrIY<k2>qukrIY
new: <L>274<pc>0283<k1>krIY<k2>krIY
;
old: (274) {@<s>“qukrIY dravyavinimaye”</s>@}¦ (IX<s>-kryAdiH</s>-1473. <s>saka</s>. <s>ani</s>. <s>uBa</s>.)
new: (274) {@<s>“qu krIY dravyavinimaye”</s>@}¦ (IX<s>-kryAdiH</s>-1473. <s>saka</s>. <s>ani</s>. <s>uBa</s>.)
; -----------------------------------------------
; Case 003: L=416, k1=guvIM, #changes=2
; guvIM -> gurvI (r sign mistaken as anusvara)
old: <L>416<pc>0405<k1>guvIM<k2>guvIM
new: <L>416<pc>0405<k1>gurvI<k2>gurvI
;
old: (416) {@<s>“guvIM udyamane”</s>@}¦ (I<s>-BvAdiH</s>-574. <s>aka</s>. <s>sew</s>. <s>para</s>.)
new: (416) {@<s>“gurvI udyamane”</s>@}¦ (I<s>-BvAdiH</s>-574. <s>aka</s>. <s>sew</s>. <s>para</s>.)
; -----------------------------------------------
; Case 004: L=1325, k1=mrewwa, #changes=2
; mrewwa -> mrewf (wa after w should be vowel 'f')
old: <L>1325<pc>1061<k1>mrewwa<k2>mrewwa
new: <L>1325<pc>1061<k1>mrewf<k2>mrewf
;
old: (1325) {@<s>“mrewwa unmAde”</s>@}¦ (I<s>-BvAdiH-aka</s>. <s>sew</s>. <s>para</s>.)
new: (1325) {@<s>“mrewf unmAde”</s>@}¦ (I<s>-BvAdiH-aka</s>. <s>sew</s>. <s>para</s>.)
; -----------------------------------------------
; Case 005: L=1594, k1=viCa, #changes=2
; viCa -> vicCa
; Print change
old: <L>1594<pc>1226<k1>viCa<k2>viCa
new: <L>1594<pc>1226<k1>vicCa<k2>vicCa
;
old: (1595) {@<s>“viCa gatO”</s>@}¦ (VI<s>-tudAdiH</s>-1423. <s>saka</s>. <s>sew</s>. <s>para</s>.)
new: (1595) {@<s>“vicCa gatO”</s>@}¦ (VI<s>-tudAdiH</s>-1423. <s>saka</s>. <s>sew</s>. <s>para</s>.)
; -----------------------------------------------
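The change records above follow a simple line-oriented layout: `;` lines are comments, and each change is an `old:` line followed by a `new:` line. As a rough sketch of how such records could be read and applied, assuming exact whole-line matching (the helper names `parse_changes` and `apply_changes` are hypothetical and are not taken from the csl-corrections tooling):

```python
def parse_changes(text):
    """Collect (old, new) pairs from change-file text, skipping ';' comments."""
    pairs = []
    old = None
    for line in text.splitlines():
        if line.startswith(";"):
            continue  # comment or separator line
        if line.startswith("old: "):
            old = line[len("old: "):]
        elif line.startswith("new: ") and old is not None:
            pairs.append((old, line[len("new: "):]))
            old = None
    return pairs


def apply_changes(lines, pairs):
    """Replace every line that exactly matches an 'old' text with its 'new' text."""
    lookup = dict(pairs)
    return [lookup.get(line, line) for line in lines]


# Case 003, copied from the records above.
sample = """; Case 003: L=416, k1=guvIM, #changes=2
; guvIM -> gurvI (r sign mistaken as anusvara)
old: <L>416<pc>0405<k1>guvIM<k2>guvIM
new: <L>416<pc>0405<k1>gurvI<k2>gurvI
;
old: (416) {@<s>“guvIM udyamane”</s>@}¦ (I<s>-BvAdiH</s>-574. <s>aka</s>. <s>sew</s>. <s>para</s>.)
new: (416) {@<s>“gurvI udyamane”</s>@}¦ (I<s>-BvAdiH</s>-574. <s>aka</s>. <s>sew</s>. <s>para</s>.)
; -----------------------------------------------
"""

pairs = parse_changes(sample)
patched = apply_changes(["<L>416<pc>0405<k1>guvIM<k2>guvIM", "unrelated line"], pairs)
```

The number of parsed pairs can be compared with the `#changes=` count in the case header as a quick sanity check.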
Reasons for print change above:
would break alphabetical order
Do we have a script that checks it? We do, right?
Notes on Sanskrit sorting.
Assume words to be sorted are in SLP1 transliteration.
This logic is aimed at Python3 code.
slp_from = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
slp_to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvw"
slp_from_to = str.maketrans(slp_from,slp_to)
We can tell if string x is less than string y:
x.translate(slp_from_to) < y.translate(slp_from_to)
If 'a' is a list of Sanskrit words, we can sort by:
sorted(a, key=lambda x: x.translate(slp_from_to))
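Putting these pieces together, a minimal self-contained sketch follows. The helper name `slp_key` and the sample word list are illustrative only; the words are taken from the exceptions discussed in this thread.

```python
# SLP1 letters in Sanskrit alphabetical order, mapped to ASCII letters
# whose code points increase in the same order.
slp_from = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
slp_to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvw"
slp_from_to = str.maketrans(slp_from, slp_to)


def slp_key(word):
    """Return a sort key for an SLP1 word (hypothetical helper name)."""
    return word.translate(slp_from_to)


words = ["asu", "ubja", "aMsa", "uBa"]
sorted_words = sorted(words, key=slp_key)
print(sorted_words)  # ['aMsa', 'asu', 'ubja', 'uBa']
```

A plain ASCII sort would place 'uBa' before 'ubja', because uppercase letters sort before lowercase ones; with the translation table, 'b' precedes 'B' as it does in the Sanskrit alphabet.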
The alphabetical ordering check for krm was done in krm_sense.py in function check_alpha.
The headwords in krm were found, by this check, to be mostly in alphabetical order, with the following exceptions:
order error: 54, [0052], asu >aMsa
order error: 105, [0102], uBa >ubja
order error: 130, [0132], fzI >f
order error: 258.1, [0262], kFY >kF
order error: 330, [0342], Kaca >KakKa
order error: 659, [0612], nawa >Rada
order error: 859, [0762], dfmPa >dfpa
order error: 1031, [0875], puzpa >puMsa
order error: 1614, [1242], vizka >vizx
order error: 1685, [1288], Samu >Sama
order error: 1698, [1292], Sasu >SaMsu
order error: 1789, [1328], Slokf >SoRf
order error: 1814, [1337], zadx >zada
order error: 1899.1, [1372], saSca >samI
order error: 1975, [1400], sraki >syala
order error: 1977, [1401], sranBu >sraMsu
order error: 1985, [1403], svarta >svada
The first exception, 'order error: 54, [0052], asu >aMsa', means that
at L=54, on page 52 of the scanned images, the entry 'aMsa' is found, and the
preceding entry is 'asu'. Since 'aMsa' precedes 'asu' alphabetically under the
Sanskrit lexicographical ordering described above, this is reported as an ordering error.
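A check of this kind can be sketched as follows. This is not the actual `check_alpha` from krm_sense.py, which is not shown in this thread; the function name `find_order_errors` and the tuple layout are assumptions, and the sample list is abridged, with `None` standing in for L and page values that the thread does not give.

```python
slp_from = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
slp_to = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvw"
slp_from_to = str.maketrans(slp_from, slp_to)


def find_order_errors(entries):
    """Report each headword that sorts before the headword preceding it.

    entries: list of (L, page, headword) tuples in file order
    (a hypothetical layout chosen for this sketch).
    """
    errors = []
    for prev, cur in zip(entries, entries[1:]):
        L, page, k1 = cur
        prev_k1 = prev[2]
        if k1.translate(slp_from_to) < prev_k1.translate(slp_from_to):
            errors.append("order error: %s, [%s], %s >%s" % (L, page, prev_k1, k1))
    return errors


# Abridged sample: only the L/page values reported in the list above are filled in.
entries = [
    (None, None, "asu"),
    ("54", "0052", "aMsa"),
    (None, None, "uBa"),
    ("105", "0102", "ubja"),
]
errors = find_order_errors(entries)
for message in errors:
    print(message)
```

Each entry is compared only with its immediate predecessor, so a single misplaced headword produces one message rather than a cascade.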
mostly in alphabetical order, with the following exceptions:
Thanks for explaining. So these 17 cases were fixed or left as is?
sanskrit lexicographical ordering
Are you aware that there is no single ordering, but at least two, with several smaller variations? For example, some put words with anusvara BEFORE, and some AFTER, the position where they belong. @drdhaval2785 developed one approach in 2014, but I do not see it documented in your simplified approach above.
KRM not modified. Am aware of the variations of ordering regarding anusvara. My simplified approach is the only one I use.
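To illustrate how sensitive the order check is to the placement of anusvara, here is a small sketch comparing the simplified ordering with one purely hypothetical variant that moves 'M' after all consonants. This variant is an assumption made for illustration; it is not @drdhaval2785's 2014 approach, which is not shown in this thread, and it is not claimed to be the ordering KRM itself follows.

```python
BASE = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
# Hypothetical variant: anusvara (M) moved to the very end of the alphabet.
VARIANT = BASE.replace("M", "") + "M"


def make_key(alphabet):
    """Build a sort-key function that ranks letters by position in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        return [rank[ch] for ch in word]

    return key


base_key = make_key(BASE)
variant_key = make_key(VARIANT)

# (preceding entry, entry) pairs taken from the exceptions listed earlier.
pairs = [("asu", "aMsa"), ("puzpa", "puMsa"), ("Sasu", "SaMsu"), ("sranBu", "sraMsu")]

flagged_base = [cur for prev, cur in pairs if base_key(cur) < base_key(prev)]
flagged_variant = [cur for prev, cur in pairs if variant_key(cur) < variant_key(prev)]
print(flagged_base)     # all four pairs are flagged under the simplified ordering
print(flagged_variant)  # none are flagged under the hypothetical variant
```

So at least some of the reported exceptions depend on the ordering convention chosen rather than on a misplaced entry.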
These changes made while comparing KRM headwords to the verbs of MW.