ambuda-org / vidyut

Infrastructure for Sanskrit software. For Python bindings, see `vidyut-py`.

vidyut-lipi doesn't handle Grantha/Tamil numbers correctly #85

Closed deepestblue closed 8 months ago

deepestblue commented 9 months ago

Grantha numbers (and Tamil numbers) aren't a place value system, and are instead representations of the Tamil words for the number. vidyut-lipi doesn't seem to have logic for that.

Transliterate ३२४५ from Devanagari to Grantha.

Expected: ௩௲௨௱௪௰௫

Actual: ३२४५

You can find many more test cases in https://github.com/deepestblue/saulabhyaJS/blob/main/test/saulabhya.test.js#L147
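
To make the expected behavior concrete, here is a minimal Python sketch of the additive scheme implied by the example above. It is not vidyut-lipi's code. The function names are hypothetical, the range is limited to 1 through 9999, and it assumes that a leading "one" is dropped before the signs for ten, hundred, and thousand (so 10 is written as ௰ alone). That last convention is an assumption and should be checked against the linked test suite.

```python
# Tamil-block code points, as used by the expected output in this issue.
DIGIT_ONE = 0x0BE7      # TAMIL DIGIT ONE; digits 1-9 are contiguous
TEN = "\u0bf0"          # TAMIL NUMBER TEN
HUNDRED = "\u0bf1"      # TAMIL NUMBER ONE HUNDRED
THOUSAND = "\u0bf2"     # TAMIL NUMBER ONE THOUSAND


def to_additive(n: int) -> str:
    """Spell n (1..9999) with digit + multiplier signs instead of place value."""
    if not 1 <= n <= 9999:
        raise ValueError(f"unsupported value in this sketch: {n}")
    out = []
    for unit, sign in ((1000, THOUSAND), (100, HUNDRED), (10, TEN), (1, "")):
        digit, n = divmod(n, unit)
        if digit == 0:
            continue  # nothing is written for an empty position
        # Assumption: a coefficient of 1 is omitted before a multiplier sign.
        if digit > 1 or unit == 1:
            out.append(chr(DIGIT_ONE + digit - 1))
        out.append(sign)
    return "".join(out)


def devanagari_to_additive(text: str) -> str:
    """Convert a run of Devanagari digits, e.g. the input from this issue."""
    # Python's int() accepts Unicode decimal digits, including Devanagari.
    return to_additive(int(text))


if __name__ == "__main__":
    print(devanagari_to_additive("३२४५"))
```

For the input from this report, the sketch produces the string listed as "Expected" above.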

akprasad commented 9 months ago

. (wrong issue)

akprasad commented 9 months ago

Thanks for filing this issue. I think this is important to support, though I notice that this behavior is not implemented by Aksharamukha or indic_transliteration, so feel free to file issues against those projects as well.

I will probably model this as a default-true flag that the user can disable if so desired.

akprasad commented 9 months ago

@deepestblue I have some local code that passes your test suite for converting to and from Grantha numbers.
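
For readers following along, the reverse direction (Grantha/Tamil numerals back to an integer) can be sketched in Python as below. This is illustrative only and is not the local code mentioned above. `from_additive` is a hypothetical name, and the sketch assumes that a multiplier sign with no preceding digit counts as one. It raises `ValueError` for unknown characters and for two digits in a row, but it does not check that the multipliers appear in descending order.

```python
# Multiplier signs from the Tamil block, which the issue's expected output uses.
MULTIPLIERS = {"\u0bf0": 10, "\u0bf1": 100, "\u0bf2": 1000}
DIGIT_ONE = 0x0BE7  # TAMIL DIGIT ONE; digits 1-9 are contiguous


def from_additive(text: str) -> int:
    """Parse digit + multiplier notation into an integer."""
    total = 0
    pending = None  # a digit still waiting for its multiplier sign
    for ch in text:
        if ch in MULTIPLIERS:
            # Assumption: a bare multiplier sign means "one times" that unit.
            total += (pending or 1) * MULTIPLIERS[ch]
            pending = None
        elif DIGIT_ONE <= ord(ch) <= DIGIT_ONE + 8:
            if pending is not None:
                raise ValueError("two digits in a row")
            pending = ord(ch) - DIGIT_ONE + 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    # A trailing digit with no multiplier is the units value.
    return total + (pending or 0)


if __name__ == "__main__":
    print(from_additive("\u0be9\u0bf2\u0be8\u0bf1\u0bea\u0bf0\u0beb"))
```

Raising an error is one way to handle malformed input; passing the text through unchanged is the other obvious choice.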

Some questions:

  1. How should an invalid number be handled?
  2. Are there contexts where place value notation would be valuable instead (Modern example: printing out a serial number)? If so, how should these be represented?
  3. Is this system used in modern Tamil as well? I see mixed evidence, e.g. here I see 100, , and க00 all used.

deepestblue commented 9 months ago

Good questions :-)

  1. Feels like a quality of implementation issue? GIGO? I guess better than that would be to signal an error in whatever mechanism vidyut-lipi uses.
  2. I'm not sure I've seen Grantha used in modern contexts like that, so I'm not sure this is important.
  3. Yes, it is, but Tamil numbers are rarely used and aren't taught in schools even in Tamil Nadu (not sure about Sri Lanka, Singapore etc.), so I'm not surprised if there's dwindling awareness.

deepestblue commented 9 months ago

BTW, https://omniglot.com/language/numbers/malayalam.htm seems to suggest Malayalam numbers are similar, but https://en.wikipedia.org/wiki/Malayalam_numerals says that's an archaic mechanism, and the new one is a simple place-value system. Not sure ...

akprasad commented 8 months ago

I've completed my local setup and will close this issue when the code is complete. For the record:

  1. For now, I don't have any special error handling for this and take a GIGO approach.
  2. Confirmed.
  3. For now, I've limited this code to just Grantha, but the logic should be easy to adapt to other scripts that need it.

akprasad commented 8 months ago

Pushed and deployed to our online demo.

deepestblue commented 7 months ago

Looks good to me! All my tests pass :-)