hedyorg / hedy

Hedy is a gradual programming language to teach children programming. Gradual languages use different language levels, where each level adds new concepts and syntactic complexity. At the end of the Hedy level sequence, kids master a subset of syntactically valid Python.
https://www.hedy.org
European Union Public License 1.2
1.32k stars, 289 forks

[FIX] Properly use non-latin numbers in input and output #1043

Closed by Felienne 2 months ago

Felienne commented 3 years ago

We now support non-Latin variable names, but some languages also have their own numeral characters, e.g. in Hindi (code for level 21):

print('५+३ क्या है ?')
उत्तर = ५+३
print('अब उत्तर है:')
print(उत्तर)
if उत्तर == ८:
    print('यह सही है!')
else:
    print('अरे नहीं! यह गलत है!')

This does not work because 1) we have no support in the grammar and 2) the transpiler also needs to know the values to translate this to Python (assuming Python does not support this? Maybe it does?)
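One way to picture the two missing pieces, as a minimal sketch rather than Hedy's actual grammar or transpiler: a regular expression that recognizes digits from any script, and a rewrite step that turns them into ASCII digits before the code is handed to Python. The helper name `normalize_numbers` is hypothetical, and string literals are skipped so that printed text keeps its original numerals.

```python
import re
import unicodedata

# Either a quoted string (kept as-is) or a run of decimal digits from any script.
# In Python 3, \d matches every Unicode decimal digit, not just 0-9.
TOKEN = re.compile(r"""('[^'\n]*'|"[^"\n]*")|(\d+)""")


def normalize_numbers(source):
    """Rewrite non-ASCII digits outside string literals as ASCII digits."""

    def replace(match):
        if match.group(1) is not None:
            return match.group(1)  # string literal: leave untouched
        # Translate digit by digit so leading zeros and decimals like 1.05 survive.
        return "".join(str(unicodedata.decimal(ch)) for ch in match.group(2))

    return TOKEN.sub(replace, source)


print(normalize_numbers("उत्तर = ५+३"))
print(normalize_numbers("print('५+३ क्या है ?')"))
```

This sketch relies on `unicodedata`, so it fits the server-side transpile step rather than code running in the browser. It also ignores comments and escaped quotes inside strings, which a real grammar would have to handle.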

This is a hairy issue but interesting!

May 2022

Some updates on the progress here: #1929 and #2722 have made some more progress towards this, but we still need:

bjorn3 commented 3 years ago

assuming Python does not support this? Maybe it does?

It doesn't support this.

>>> ५+३
  File "<stdin>", line 1
    ५+३
    ^
SyntaxError: invalid character '५' (U+096B)

Felienne commented 3 years ago

Pity but thanks for checking!!

eddieantonio commented 2 years ago

Just checked and this DOES work:

>>> int("५") + int("३")
8

Which means it's a matter of letting the parser accept non-ASCII digits!

For reference, Python's int() parses non-ASCII digits by transliterating them to ASCII digits, one by one. Ultimately, this delegates to unicodedata.decimal().

>>> import unicodedata
>>> unicodedata.decimal("५")
5

Felienne commented 2 years ago

Thanks a lot for that, Eddie! Sadly, Python implements this, but Skulpt (which we use to run Python in JavaScript) does not 😭

So I guess I will have to shoestring this together myself
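If `int()` on non-ASCII digits cannot be relied on in Skulpt, a hand-rolled conversion is small, because Unicode lays out each script's decimal digits as ten consecutive code points starting at zero. A minimal sketch, assuming only the scripts listed in the table below matter; the names `ZERO_CODE_POINTS`, `digit_value`, and `parse_int` are hypothetical, and whether this runs unchanged under Skulpt has not been checked here.

```python
# Code point of the digit zero in each supported script (from the Unicode charts).
ZERO_CODE_POINTS = {
    "Latin": 0x0030,
    "Arabic-Indic": 0x0660,
    "Extended Arabic-Indic": 0x06F0,
    "Devanagari": 0x0966,
    "Bengali": 0x09E6,
}


def digit_value(ch):
    """Return 0-9 for a supported digit character, or None otherwise."""
    code = ord(ch)
    for zero in ZERO_CODE_POINTS.values():
        if zero <= code <= zero + 9:
            return code - zero
    return None


def parse_int(text):
    """Parse a non-negative integer written in any supported script."""
    if not text:
        raise ValueError("empty number")
    result = 0
    for ch in text:
        value = digit_value(ch)
        if value is None:
            raise ValueError("not a digit: " + repr(ch))
        result = result * 10 + value
    return result


print(parse_int("५") + parse_int("३"))
```

The code avoids `unicodedata` and `int()` on purpose, using only `ord`, comparisons, and arithmetic, so it depends on as little of the runtime as possible.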

eddieantonio commented 2 years ago

So I learned the hard way that parsing UnicodeData.txt is not super fun. Here's a table of numerals and their digit value (including Eastern Arabic, Hindi/Devanagari, and Bengali) from the Unicode Character Database: https://github.com/eddieantonio/numerals-in-unicode/blob/main/Numerals.ipynb
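For readers who have not seen the file: each line of UnicodeData.txt is a semicolon-separated record in which field 0 is the code point in hex, field 2 the general category, and field 6 the decimal digit value. Below is a minimal sketch of extracting a digit table from it, using three sample records embedded in the code rather than the real file. The function name `decimal_digits` is hypothetical, and this is not necessarily how the linked notebook does it.

```python
# Three records in the UnicodeData.txt layout, standing in for the real file.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0966;DEVANAGARI DIGIT ZERO;Nd;0;L;;0;0;0;N;;;;;
096B;DEVANAGARI DIGIT FIVE;Nd;0;L;;5;5;5;N;;;;;
"""


def decimal_digits(lines):
    """Map each decimal-digit character to its integer value."""
    table = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 7:
            continue  # blank or malformed line
        code_point, category, decimal = fields[0], fields[2], fields[6]
        # "Nd" is the general category for decimal digits.
        if category == "Nd" and decimal:
            table[chr(int(code_point, 16))] = int(decimal)
    return table


table = decimal_digits(SAMPLE.splitlines())
print(table)
```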

Felienne commented 2 years ago

Would love to hear about your battle scars one day!

Felienne commented 5 months ago

@boryanagoncharenko I think you are now working on this one!

Felienne commented 2 months ago

Hi @boryanagoncharenko!

Not sure what input from me is needed on this one?

boryanagoncharenko commented 2 months ago

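Perhaps it is a good idea to make this story more concrete. Currently, the situation is that we already have a solution that converts numbers between different numeral systems. It works but we don't like it a lot and we have to maintain it. So then the questions are:

  • Is there anything about our solution that needs to be improved? Maybe we don't support some numeral systems or our conversion method fails in certain cases?
  • Of course we would prefer if this conversion is not our responsibility and is maybe just embedded in Skulpt. Is it worth pursuing this path?

Or is there maybe anything else implied by the phrase "properly use" in the title?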
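To make "converts numbers between different numeral systems" concrete, here is a minimal sketch of the output direction: rendering a Python integer with the digits of a chosen script. It is an illustration under assumptions, not the existing Hedy solution; the names `ZERO` and `to_script` are hypothetical, and only three scripts are covered.

```python
# Code point of digit zero per script; the ten digits follow consecutively.
ZERO = {"Latin": 0x0030, "Arabic-Indic": 0x0660, "Devanagari": 0x0966}


def to_script(number, script):
    """Render an integer using the digits of the given script."""
    offset = ZERO[script] - ord("0")
    # Shift only ASCII digits, so a leading minus sign passes through unchanged.
    return "".join(
        chr(ord(ch) + offset) if "0" <= ch <= "9" else ch for ch in str(number)
    )


print(to_script(8, "Devanagari"))
print(to_script(-42, "Arabic-Indic"))
```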

Felienne commented 2 months ago

Perhaps it is a good idea to make this story more concrete. Currently, the situation is that we already have a solution that converts numbers between different numeral systems. It works but we don't like it a lot and we have to maintain it. So then the questions are:

  • Is there anything about our solution that needs to be improved? Maybe we don't support some numeral systems or our conversion method fails in certain cases?
  • Of course we would prefer if this conversion is not our responsibility and is maybe just embedded in Skulpt. Is it worth pursuing this path?

Or is there maybe anything else implied by the phrase "properly use" in the title?

Yeah, this issue is extremely old, has been partly addressed, and is not so well-phrased anymore. I will close it and split up the remaining work into a few smaller items!