drdhaval2785 / SanskritSorting

Codes written by Dr. Dhaval Patel for Sanskrit Natural Language Programming

Three MW key2 Mistakes in Reverse Sorting #35

Closed gasyoun closed 9 years ago

gasyoun commented 9 years ago

1) ~

sArisfkta
sArisTA-KA~
sAru

See https://github.com/sanskrit-lexicon/CORRECTIONS/issues/12. I can't find where 'DEVANAGARI SIGN CANDRABINDU' (U+0901) should be encoded for IAST, so in my Charter font I've put it at E5BE, a private-use character. The Code2000 font has nothing candrabindu-like, so I can't comment on that. I wonder what Peter would say about it. I've added it to my IAST font for book printing, but there is no standard that I'm aware of. Dhaval, if you use E5BE for ~, it will help me make it look good. @drdhaval2785 Will your sorting be able to manage it as well?
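For illustration only, the substitution asked for here could look like the sketch below. The function name `mark_candrabindu` is hypothetical, and U+E5BE is a private-use code point, so it only renders as candrabindu in a font that defines it there (such as the customised Charter font mentioned in this comment).

```python
# Assumption: U+E5BE is the private-use slot chosen in this thread;
# it is not a standard encoding for candrabindu.
CANDRABINDU_PUA = "\uE5BE"


def mark_candrabindu(slp1_text):
    """Swap the SLP1 candrabindu '~' for the private-use glyph."""
    return slp1_text.replace("~", CANDRABINDU_PUA)


# Example with a headword from this issue.
display_form = mark_candrabindu("sArisTA-KA~")
```

Doing the swap only at display time would keep the stored keys in plain SLP1, so sorting would not need to know about the private-use character.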


2) @funderburkjim only Jim can tell what's wrong here 2

dA/Sa
dAS2a
dASI/
<H2A><h><hc3>102</hc3><key1>dASa</key1><hc1>3</hc1><key2>dAS2a</key2></h><body> <lex type="inh">m.</lex> <c>servant_,_slave</c> <ls>L.</ls>  </body><tail> <pc>476,3</pc> <L>91865.2</L></tail></H2A>

3) a -a (1);

a
a--kAra

Maybe a is wrongly sorted because it is the very first word in the input list?

drdhaval2785 commented 9 years ago

Q 1: The issue is: where should candrabindu sort at all?

drdhaval2785 commented 9 years ago

Q 3: input / output please. There was no issue in my output at http://drdhaval2785.github.io/accentsorted.html

funderburkjim commented 9 years ago

re 'dAS2a':

This is an error in key2. It should be dA/Sa.

Made Correction.
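A check along the following lines might catch similar entries. It is only a sketch: the helper name `suspicious_key2` is hypothetical, and it assumes that a digit never legitimately appears inside a key2 value, which would need to be confirmed against the full data.

```python
import re

# Pull key1 and key2 out of a single record line.
KEY_PAIR = re.compile(r"<key1>(.*?)</key1>.*?<key2>(.*?)</key2>")


def suspicious_key2(record):
    """Return (key1, key2) if key2 contains a digit, else None."""
    m = KEY_PAIR.search(record)
    if m is None:
        return None
    key1, key2 = m.groups()
    # Assumption: digits are never valid in key2, so any digit is flagged.
    return (key1, key2) if re.search(r"\d", key2) else None


record = "<H2A><h><hc3>102</hc3><key1>dASa</key1><hc1>3</hc1><key2>dAS2a</key2></h>"
print(suspicious_key2(record))  # ('dASa', 'dAS2a')
```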

drdhaval2785 commented 9 years ago

Q2 closed

gasyoun commented 9 years ago

A1: What about after anusvara? Starting a holy war over a single letter does not make much sense now, but leaving it where it is right now does not make sense either. What do you think, @drdhaval2785?

A3: So there is something I need to understand myself: whether a should come before ka in our case. To be honest, I have no idea.

drdhaval2785 commented 9 years ago

Supplement to A3: I think that all pure vowels should sort before any consonant+a combination, for sure. So sorting 'a', 'A', 'i', etc. before 'ka' makes sense to me.
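A minimal sketch of that ordering, assuming the headwords are in SLP1 and that accents, hyphens and digits should be ignored when comparing. The names `SLP1_ORDER` and `sort_key` are illustrative and not taken from the repository's code. The candrabindu `~` is also skipped here, since its position is the open question in Q 1.

```python
# SLP1 letters in traditional Sanskrit alphabetical order: vowels first,
# then anusvara (M) and visarga (H), then consonants.
SLP1_ORDER = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
RANK = {ch: i for i, ch in enumerate(SLP1_ORDER)}


def sort_key(word):
    """Rank each known SLP1 letter; skip accents, hyphens and other marks."""
    return [RANK[ch] for ch in word if ch in RANK]


words = ["ka", "i", "a--kAra", "A", "a"]
print(sorted(words, key=sort_key))  # ['a', 'a--kAra', 'A', 'i', 'ka']
```

Because a shorter list compares as smaller than a longer list with the same prefix, 'a' lands before 'a--kAra', and every vowel-initial word lands before 'ka'.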

drdhaval2785 commented 9 years ago

As discussed above, there were actually digits inside. No sorting issue.

gasyoun commented 9 years ago

@funderburkjim it's still there in the latest XML as well, search for dAS2a

<H2A><h><hc3>102</hc3><key1>dASa</key1><hc1>3</hc1><key2>dAS2a</key2></h><body>

gasyoun commented 9 years ago

@funderburkjim updated?