wordplaydev / wordplay

An accessible, language-inclusive programming language and IDE for creating interactive typography on the web.

Korean numerals #3

Closed amyjko closed 5 months ago

amyjko commented 1 year ago

Add support for Korean numerals, alongside Chinese and Japanese.

jamiehankim commented 6 months ago

Hi Professor! Jimin, Suh Yong, and I are fluent in Korean, so we would like to know what to do for this issue and where to start. Please let us know.

amyjko commented 6 months ago

Yay, it'll be great to include these numerals!

The first key thing to understand is that programming languages parse program text into tree data structures. Before they do this, they tokenize the text into sequences of symbols that represent individual pieces of a program, like name, (, or :. Korean numerals need to be added as a new kind of token in Wordplay, just as Arabic and Japanese numerals are already supported.
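
To make the tokenizing step concrete, here is a minimal sketch of the idea, not Wordplay's actual tokenizer. The token kinds, the patterns, and the tokenize function are all hypothetical names chosen for illustration; the only things taken from this thread are that numerals (Arabic and Japanese) and names are different kinds of tokens.

```typescript
// Hypothetical token kinds; Wordplay's real tokenizer defines its own.
type TokenKind = "number" | "name" | "symbol";

interface Token {
  kind: TokenKind;
  text: string;
}

// Patterns are anchored at the start of the remaining text and tried in order.
// "number" comes first so numeral characters are never read as part of a name.
// Assumption: names are limited here to Latin letters and Hangul syllables.
const patterns: [TokenKind, RegExp][] = [
  ["number", /^(?:[0-9]+(?:\.[0-9]+)?|[一二三四五六七八九十百千万]+)/],
  ["name", /^[A-Za-z_\uAC00-\uD7A3][A-Za-z0-9_\uAC00-\uD7A3]*/],
  ["symbol", /^[()\[\]:+\-*\/]/],
];

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  let rest = source.replace(/^\s+/, "");
  while (rest.length > 0) {
    let found: Token | undefined;
    for (const [kind, pattern] of patterns) {
      const match = pattern.exec(rest);
      if (match !== null) {
        found = { kind, text: match[0] };
        break;
      }
    }
    if (found === undefined) {
      throw new Error("Unexpected text: " + rest);
    }
    tokens.push(found);
    // Drop the matched text and any whitespace that follows it.
    rest = rest.slice(found.text.length).replace(/^\s+/, "");
  }
  return tokens;
}
```

In this sketch, supporting a new numeral system would mean extending the "number" pattern, which is why the choice of which characters count as numerals matters: any character claimed by that pattern can no longer start a name.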

Here are the places that need to be updated:

Throughout, be sure to run 'npm run test' and 'npm run check' to make sure that your changes pass all tests and don't introduce any TypeScript errors.

I think that should get you started! You can submit draft PRs with questions or requests for feedback, or comment here.

anhourandaquarter commented 6 months ago

Context: To my knowledge, there aren't separate symbols for numbers in Korean, outside of the number-words. Korean uses Hindu-Arabic or literary Chinese (Hanja) for numbers.

There are also two systems of representing numbers: the Native Korean system and the Sino-Korean system. And not to mention the number unit suffixes for different categories of objects! (For example, groups of people have the suffix -myeong; animals have -mari; etc.)

Questions: 1) When creating support for Korean numerals, does this mean supporting both systems of expressing Korean number words? 2) Should unit suffixes also be considered?

amyjko commented 6 months ago

Aren't these Korean numerals?

https://en.wikipedia.org/wiki/Korean_numerals

Which numerals to support is a design consideration; I added the "needs design" label. What support for Korean numerals would be reasonable for all Korean-fluent creators?

Units are supported separately from numerals in Wordplay. For example, 1cat is a valid number, with the unit cat. Try that with Korean unit suffixes; it should already work with Arabic numerals.

anhourandaquarter commented 6 months ago

Aren't these Korean numerals?

Those are! My question was that those are number words, like "one" instead of 1. I tried converting between "one" and 1 in Wordplay, and that conversion wasn't supported ("one" -> #). I don't know if this conversion should work for the Korean number words.

What support for Korean numerals would be reasonable for all Korean fluent creators?

I would say that both systems would be important, because Korean speakers use both systems pretty regularly. But I think this also depends on whether we want number-words to have the same meaning as number-symbols. If we use the Hindu-Arabic numerals, then I'm not sure it would matter too much? Also happy to have more input on this!

Units are supported separately from numerals in Wordplay.

Yep, these work as expected!

amyjko commented 6 months ago

Ah, thanks for the clarification. Number words in Wordplay are interpreted as identifier names to store values, e.g., one: 1 stores the value 1 in the name one, numbers: [1 2 3] stores the value [1 2 3] in the name numbers. The value "one" is just the text value "one"; text values aren't interpreted to have any special meaning, though you could open another issue on text to number conversions. That would be a tricky design and engineering challenge though, converting arbitrary text to numbers! There are so many different words to cover.

Can you say more about why number word support matters for number symbols? Or are they two independent concerns?

anhourandaquarter commented 6 months ago

Can you say more about why number word support matters for number symbols? Or are they two independent concerns?

I would say they're two independent concerns. Definitely would be more challenging, since that would be related to the conversion between text and numbers!

I think my question now is, since Korean doesn't use separate number symbols the way that some other languages do, is this an issue that still needs work? Thanks!

amyjko commented 6 months ago

I guess I'm still confused (probably since I don't know Korean). The closest language I know is Japanese, which does use number symbols like 一二三四五六七八九十百千万 to represent numbers; Wordplay reserves these as numbers, rather than as words, so that they can be mixed with modern arithmetic syntax, e.g., 一 + 一 evaluates to 二.
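
For anyone unfamiliar with how these symbols compose, here is a rough sketch of turning a string of them into a numeric value. It is an illustration under assumptions, not Wordplay's implementation: the function name kanjiToNumber and the lookup tables are hypothetical, and the sketch does no validation of symbol order.

```typescript
// Digit symbols and the multiplier symbols below 万 (10000).
const kanjiDigits: Record<string, number> = {
  "一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
  "六": 6, "七": 7, "八": 8, "九": 9,
};
const kanjiUnits: Record<string, number> = { "十": 10, "百": 100, "千": 1000 };

// Hypothetical helper: converts e.g. 二十三 to 23.
function kanjiToNumber(text: string): number {
  let total = 0;   // value of completed 万 groups
  let section = 0; // value below 10000 accumulated so far
  let digit = 0;   // digit waiting for a unit symbol
  for (const ch of text.split("")) {
    const d = kanjiDigits[ch];
    const u = kanjiUnits[ch];
    if (d !== undefined) {
      digit = d;
    } else if (u !== undefined) {
      // A bare unit such as 十 means one of that unit.
      section += (digit === 0 ? 1 : digit) * u;
      digit = 0;
    } else if (ch === "万") {
      const group = section + digit;
      total += (group === 0 ? 1 : group) * 10000;
      section = 0;
      digit = 0;
    } else {
      throw new Error("Not a numeral symbol: " + ch);
    }
  }
  return total + section + digit;
}
```

With this, kanjiToNumber("一") + kanjiToNumber("一") gives 2, which is the value of 二.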

Can Korean numerals be used in the same way, or are they only used as words in sentences, and never to represent numbers in arithmetic expressions? Would it be inappropriate to say 둘 + 둘 to yield 넷?

anhourandaquarter commented 6 months ago

Would it be inappropriate to say 둘 + 둘 to yield 넷?

I don't think it'd be inappropriate, per se, but probably uncommon (like how we probably wouldn't write "one + one" in English). I think it may have something to do with the fact that Korean arithmetic is done using the Sino-Korean number system, which usually uses the Hindu-Arabic number symbols. (둘 and 넷 both come from the Native-Korean system and are more commonly used outside of a mathematical context.)

I'll do some more research into the usages to figure out what will be appropriate!

amyjko commented 6 months ago

Thanks, that's a helpful explanation. Part of what we're debating here is how to decolonize mathematics, which is very English and Arabic. That might mean reclaiming some parts of arithmetic and algebra for other languages. There's definitely a cultural judgement to make here, but in the context of computation.

One other way of thinking through this is whether Korean number words should be interpreted in Wordplay programs as numbers or names. Because those are the two choices: either they're names, and we can assign values to them, e.g., 둘: 2 assigns the value 2 to the name 둘, or they're numbers, in which case we would just say 둘 to represent the number 2, but we wouldn't be able to write 둘: 2, because that wouldn't be a valid name. Which design would be 1) more reasonable for Korean, and 2) more useful for writing programs?

jsung1014 commented 6 months ago

@amyjko For this project, it's more practical to use Sino-Korean numbers (일, 이, 삼, 사) as they are commonly used in Korea for mathematics, especially when handling large numbers. Additionally, it's important to acknowledge that the Korean education system primarily uses Arabic numerals, not textual representations of numbers. Having lived in Korea for most of my life, I've never seen anyone use Pure-Korean numbers like "하나 둘 셋 넷" in a mathematical or educational context. Therefore, in our programming environment, treating Korean number words directly as numbers, where '이' represents '2', aligns with both conventional usage and educational practices. To prevent any confusion when it comes to sentences, we can use single quotes to distinguish numbers from any context. Please let me know your thoughts!
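
As a rough illustration of what treating Sino-Korean number words as numbers could look like, here is a small sketch. Everything in it is an assumption for discussion rather than a proposed implementation: the function name parseSinoKorean is hypothetical, only 일 through 구 plus 십, 백, 천, and 만 are covered, and symbol order is not validated. It returns undefined for any word that is not made up entirely of these characters, so a caller could fall back to treating that word as a name.

```typescript
// Sino-Korean digits (일 = 1 ... 구 = 9) and unit words below 만 (10000).
const sinoDigits: Record<string, number> = {
  "일": 1, "이": 2, "삼": 3, "사": 4, "오": 5,
  "육": 6, "칠": 7, "팔": 8, "구": 9,
};
const sinoUnits: Record<string, number> = { "십": 10, "백": 100, "천": 1000 };

// Hypothetical helper: a number if the whole word is Sino-Korean numerals,
// otherwise undefined (meaning "treat this word as a name instead").
function parseSinoKorean(word: string): number | undefined {
  if (word.length === 0) return undefined;
  let total = 0;   // value of completed 만 groups
  let section = 0; // value below 10000 accumulated so far
  let digit = 0;   // digit waiting for a unit word
  for (const ch of word.split("")) {
    const d = sinoDigits[ch];
    const u = sinoUnits[ch];
    if (d !== undefined) {
      digit = d;
    } else if (u !== undefined) {
      section += (digit === 0 ? 1 : digit) * u;
      digit = 0;
    } else if (ch === "만") {
      const group = section + digit;
      total += (group === 0 ? 1 : group) * 10000;
      section = 0;
      digit = 0;
    } else {
      return undefined;
    }
  }
  return total + section + digit;
}
```

Under these assumptions, parseSinoKorean("이") is 2 and parseSinoKorean("이십삼") is 23, while a Pure-Korean word such as 하나 yields undefined and would remain available as a name.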

jsung1014 commented 6 months ago

@amyjko Additionally, I conducted an informal survey among my friends in Korea, including both those who are studying coding-related fields and those who are not. They all agree that Arabic numerals are most familiar for learning programming. However, if we have to translate these into Korean, they would prefer using the Sino-Korean number system over the Pure Korean system.

jsung1014 commented 6 months ago

@amyjko I noticed that this issue has been tagged with a "needs design" label, but I am unclear about the specific design requirements it entails. Could you please clarify whether this label refers to the design of Korean number systems specifically, or if there are other design aspects you would like us to consider as a design team? Your guidance would be invaluable in helping us address and devise an appropriate solution!

amyjko commented 6 months ago

What we are talking about right now is the design. We're designing how and whether to include Sino-Korean numbers in the Wordplay programming language. So a design proposal would include a detailed description of what should change about the programming language's syntax to support Sino-Korean numbers.

anhourandaquarter commented 5 months ago

Closing this issue as of 1 May 2024.

Resolution: We will not be adding Korean numerals at this time. Korean users can use Hindu-Arabic numerals and declare variables in Korean that match Korean number words (e.g., variables like "one" or "하나").

Reasoning: After Ji Min, Jamie, and I conducted an informal survey among other Korean speakers and users, we confirmed that Korean number words are usually not used in place of the numbers themselves (e.g., "one" instead of 1). As such, we believe it will be most user-friendly to continue using Hindu-Arabic numerals. Users can still create variable names or units in Korean, and use these as normal.

Next steps: Our team will continue work on issue #448, which was identified during the process of reaching a resolution on this issue.