sanskrit-lexicon / hwnorm1

Headword normalization for Cologne dictionaries

fem. singular/plurals should be joined #9

Open funderburkjim opened 7 years ago

funderburkjim commented 7 years ago

Noticed an hwnorm1 deficiency in work on https://github.com/sanskrit-lexicon/CORRECTIONS/issues/341.

cAturmAsyakArikAH is the ACC spelling (f. plural), while cAturmAsyakArikA (f. singular) is the spelling in MW and PW.

These should generate only 1 entry in hwnorm1, rather than 2 entries.

Not sure how to handle this without introducing false positives.

gasyoun commented 7 years ago

> These should generate only 1 entry in hwnorm1, rather than 2 entries.

Totally agree.

Let's go for sg. That should be added to https://github.com/sanskrit-lexicon/hwnorm1/issues/3 and https://github.com/sanskrit-lexicon/CORRECTIONS/issues/43

> Not sure how to handle this without introducing false positives.

Let's move 1 by 1.

drdhaval2785 commented 3 years ago

Is there a list of headword pairs where one spelling ends with 'AH' and the other with 'A', @funderburkjim? Such a list would help us arrive at a methodology that reduces the false positives.
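
One way such a list might be produced is sketched below. This is only a rough illustration, not hwnorm1's actual code: the input format (pairs of SLP1 spelling and dictionary code) and the function name `plural_singular_candidates` are assumptions, and the sample rows are illustrative apart from the cAturmAsyakArikA case from this issue. It reports every headword ending in 'AH' whose form without the final 'H' is also a headword, so each pair can be reviewed by hand before any joining rule is adopted.

```python
from collections import defaultdict


def plural_singular_candidates(headwords):
    """Return candidate (plural, dicts, singular, dicts) rows.

    `headwords` is an iterable of (slp1_spelling, dictionary_code) pairs;
    this input shape is an assumption made for the sketch.
    """
    # Collect the set of dictionaries in which each spelling occurs.
    dicts_by_spelling = defaultdict(set)
    for spelling, code in headwords:
        dicts_by_spelling[spelling].add(code)

    candidates = []
    for plural in sorted(dicts_by_spelling):
        if not plural.endswith("AH"):
            continue
        singular = plural[:-1]  # drop the final 'H' (visarga in SLP1)
        # Only report the pair when the shorter form is itself attested.
        if singular in dicts_by_spelling:
            candidates.append((
                plural, sorted(dicts_by_spelling[plural]),
                singular, sorted(dicts_by_spelling[singular]),
            ))
    return candidates


# Illustrative sample data; a real run would read the hwnorm1 input files.
sample = [
    ("cAturmAsyakArikAH", "ACC"),
    ("cAturmAsyakArikA", "MW"),
    ("cAturmAsyakArikA", "PW"),
    ("deva", "MW"),
]

for row in plural_singular_candidates(sample):
    print(row)
```

Requiring the 'A' form to be attested keeps the list conservative; whether each reported pair really is the same word in f. singular and f. plural would still need a manual check.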