sanskrit-lexicon / MWS

Monier Monier-Williams, Sir; A Sanskrit-English dictionary. Oxford, 1899

'Bad' Side-effects of <ab n="> and <ls> tagging #162

Open Andhrabharati opened 3 months ago

Andhrabharati commented 3 months ago

It is seen that at many tagged places, the punctuation mark(s) [esp. the 'comma'] got deleted in the text.

I also noticed that the plural letter 's' is added erroneously at many places or deleted at many places.

Finally, there are places that got expanded leaving out the contraction mark '°' altogether, and there are quite a few places that got expanded wrongly.

The ONLY way to make things proper in all these cases (running into a few thousand) is a FULL reading of the data.

Andhrabharati commented 3 months ago

I've reopened MW, after a long gap of one year, NOW (for revising in AB style).

funderburkjim commented 3 months ago

plural letter 's' is added erroneously at many places or deleted at many places

A coding convention was devised to deal with these 'plurals of IAST Sanskrit words'

296 matches in 290 lines for "<ab n="[^<]*</ab>s" in buffer: mw.txt

Example: the school of the <ab n="Taittirīya" slp1="tEttirIya">T°</ab>s.
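For anyone wanting to reproduce that count outside the editor, here is a minimal Python sketch of the same search. It reuses the regex quoted above unchanged. The function name `count_plural_abbreviations` and the second sample line are made up for illustration, and reading the real mw.txt is left out, so treat this as a rough check and not the project's actual tooling.

```python
import re

# Same pattern as in the search above: an <ab n="..."> element
# immediately followed by a plural "s".
PLURAL_AB = re.compile(r'<ab n="[^<]*</ab>s')


def count_plural_abbreviations(lines):
    """Return (total matches, number of lines with at least one match)."""
    total = 0
    matching_lines = 0
    for line in lines:
        hits = PLURAL_AB.findall(line)
        if hits:
            total += len(hits)
            matching_lines += 1
    return total, matching_lines


# The first line is the example from this thread; the second is a
# hypothetical line whose <ab> element carries no plural "s".
sample = [
    'the school of the <ab n="Taittirīya" slp1="tEttirIya">T°</ab>s.',
    'a line with <ab n="compound">comp.</ab> only',
]

print(count_plural_abbreviations(sample))
```

Note that the pattern only checks for an 's' directly after the closing tag, so it would also match a word beginning with 's' that happens to follow `</ab>` with no space in between.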

Here is the display with tooltip:
