Closed Felienne closed 2 months ago
assuming Python does not support this? Maybe it does?
It doesn't support this.
>>> ५+३
  File "<stdin>", line 1
    ५+३
    ^
SyntaxError: invalid character '५' (U+096B)
Pity but thanks for checking!!
Just checked and this DOES work:
>>> int("५") + int("३")
8
Which means it's a matter of letting the parser accept non-ASCII digits!
For reference, Python's int() parses non-ASCII digits by transliterating them to ASCII digits, one by one. Ultimately, this delegates to unicodedata.decimal().
>>> import unicodedata
>>> unicodedata.decimal("५")
5
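A minimal sketch of the transliteration idea described above, assuming CPython's `unicodedata` module is available. The helper name `to_ascii_digits` is hypothetical; this is not how `int()` is implemented internally, it only illustrates mapping each decimal digit to its ASCII counterpart before the text reaches a parser.

```python
import unicodedata

def to_ascii_digits(text: str) -> str:
    """Replace every Unicode decimal digit in text with its ASCII equivalent."""
    out = []
    for ch in text:
        # decimal() returns the default (None here) for non-digit characters
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)

print(to_ascii_digits("५+३"))  # 5+3
```

Characters that are not decimal digits (operators, letters, whitespace) pass through unchanged, so the result can be handed to a parser that only understands ASCII digits.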
Thanks a lot for that Eddie, but sadly, Python implements this but Skulpt (which we use to run Python in JavaScript) does not 😭
So I guess I will have to shoestring this together myself
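One way to shoestring this together without `unicodedata` is sketched below. It relies on the fact that Unicode encodes the ten decimal digits of a script contiguously, so a digit's value is its offset from that script's zero. The `ZEROS` table and the names `digit_value` and `parse_int` are hypothetical and cover only a few scripts. Whether this runs unchanged under Skulpt has not been verified here.

```python
# Code points of the digit zero in a few scripts (hand-picked, not exhaustive).
ZEROS = {
    "Arabic-Indic": 0x0660,
    "Extended Arabic-Indic": 0x06F0,
    "Devanagari": 0x0966,
    "Bengali": 0x09E6,
}

def digit_value(ch):
    """Return 0-9 for a known digit character, or None otherwise."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    for zero in ZEROS.values():
        offset = ord(ch) - zero
        if 0 <= offset <= 9:
            return offset
    return None

def parse_int(text):
    """Parse a non-negative integer written in any of the scripts above."""
    if not text:
        raise ValueError("empty numeral")
    total = 0
    for ch in text:
        value = digit_value(ch)
        if value is None:
            raise ValueError("not a digit: " + repr(ch))
        total = total * 10 + value
    return total

print(parse_int("५") + parse_int("३"))  # 8
```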
So I learned the hard way that parsing UnicodeData.txt is not super fun. Here's a table of numerals and their digit value (including Eastern Arabic, Hindi/Devanagari, and Bengali) from the Unicode Character Database: https://github.com/eddieantonio/numerals-in-unicode/blob/main/Numerals.ipynb
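As a rough alternative to parsing UnicodeData.txt by hand, a similar table can be sketched with CPython's `unicodedata` module, which exposes the same database (in whatever Unicode version that Python build ships). The function name `decimal_digit_table` is hypothetical, and this is not necessarily how the linked notebook builds its table.

```python
import sys
import unicodedata

def decimal_digit_table() -> dict:
    """Map every character that has a decimal digit value to that value (0-9)."""
    table = {}
    for code_point in range(sys.maxunicode + 1):
        ch = chr(code_point)
        # decimal() returns the default (None here) for non-digit characters
        value = unicodedata.decimal(ch, None)
        if value is not None:
            table[ch] = value
    return table

table = decimal_digit_table()
print(table["५"], table["٣"], table["৩"])  # 5 3 3
```

The resulting dictionary could be dumped to a static lookup file for environments that lack `unicodedata`.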
Would love to hear about your battle scars one day!
@boryanagoncharenko I think you are now working on this one!
Hi @boryanagoncharenko!
Not sure what input from me is needed on this one?
Perhaps it is a good idea to make this story more concrete. Currently, the situation is that we already have a solution that converts numbers between different numeral systems. It works but we don't like it a lot and we have to maintain it. So then the questions are:
- Is there anything about our solution that needs to be improved? Maybe we don't support some numeral systems or our conversion method fails in certain cases?
- Of course we would prefer if this conversion is not our responsibility and is maybe just embedded in Skulpt. Is it worth pursuing this path?
Or is there maybe anything else implied by the phrase "properly use" in the title?
Yeah this issue is extremely old, has been partly addressed, and is not so well-phrased anymore. I will close it and split up the remaining work in a few smaller items!
We now support non-Latin variable names but... some languages have different number characters too, e.g. in Hindi (code for level 21):
This does not work because 1) we have no support in the grammar and 2) the transpiler also needs to know the values to translate this to Python (assuming Python does not support this? Maybe it does?)
This is a hairy issue but interesting!
May 2022
Some updates on the progress here: #1929 and #2722 have made some more progress towards this, but we still need:
- non-Latin decimals (level 12 and up)
Fixed by #2741!!