wenyan-lang / wenyan

文言文編程語言 A programming language for the ancient Chinese.
https://wy-lang.org/
MIT License

On Chinese numerals #338

Open brynne8 opened 4 years ago

brynne8 commented 4 years ago

Since parsing Chinese numerals is an interesting task, I wrote a simple parser as a PEG using LPeg.re.

Link: chinese_number.lua

LingDong- commented 4 years ago

Thanks for pointing out the issues! The Chinese numerals have always been the hard part.

Thank you!

antfu commented 4 years ago

I would propose a new approach.

How about we implement a print function in the standard library that renders numbers and other values as hanzi, and have 書之 call that function by default? This would output numbers as hanzi without any hijacking in the IDE, and it would work everywhere. Besides, another syntax, such as 記之 or something similar, may need to be introduced for the raw output of the target language (which is how 書之 works currently).

I am not very good at wenyan, so please feel free to suggest better wording.
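To make the proposal concrete, here is a minimal Python sketch of the kind of number-to-hanzi conversion such a print function might perform. The names `to_hanzi` and `section_to_hanzi` are hypothetical and are not part of wenyan's standard library. The sketch assumes non-negative integers below 10^12, traditional characters (萬, 億), and the formal written form (一十 rather than a bare 十).

```python
DIGITS = "零一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]   # units inside one four-digit group
BIG_UNITS = ["", "萬", "億"]           # one unit per four-digit group


def section_to_hanzi(n):
    """Render 0 < n < 10000, e.g. 1005 -> 一千零五 (hypothetical helper)."""
    out = []
    pending_zero = False
    for power in range(3, -1, -1):
        digit = n // 10 ** power % 10
        if digit == 0:
            # Only remember a zero once something has already been written.
            pending_zero = bool(out)
        else:
            if pending_zero:
                out.append("零")
                pending_zero = False
            out.append(DIGITS[digit] + SMALL_UNITS[power])
    return "".join(out)


def to_hanzi(n):
    """Render a non-negative integer below 10**12 in formal hanzi."""
    if n == 0:
        return "零"
    if not 0 < n < 10 ** 12:
        raise ValueError("only integers in [0, 10**12) are handled")
    groups = []                      # four-digit groups, lowest first
    while n:
        groups.append(n % 10000)
        n //= 10000
    out = []
    pending_zero = False
    for idx in range(len(groups) - 1, -1, -1):
        group = groups[idx]
        if group == 0:
            pending_zero = bool(out)
            continue
        # A skipped group, or a group without a 千 digit, needs a 零.
        if out and (pending_zero or group < 1000):
            out.append("零")
        pending_zero = False
        out.append(section_to_hanzi(group) + BIG_UNITS[idx])
    return "".join(out)


print(to_hanzi(250))    # formal form, with the 十 written out
print(to_hanzi(10005))
```

A 書之-style printer could call something like this for numeric values and fall back to the target language's raw output otherwise. Fractions and negative numbers are left out of the sketch.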

SaltfishAmi commented 4 years ago
* 一百一: 101 was the original behavior, but changed as requested by this issue: #24 . 二百五=250 sounds more common though.

It certainly sounds more common, but only in spoken language; in fact it is too colloquial. Formally, 二百五 should be 205.

oovm commented 4 years ago

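I have an algorithm here, though I'm not sure whether it has any holes:

Read from left to right. For each digit read, multiply the running value by ten and add the next digit; if the token is a multiplier word, multiply by the corresponding multiplier instead.

Then, on reaching <EOS>, run one extra check: if what precedes is not 十, multiply by ten.

This works because only 二百五 exists; there is no 二百五万, which can only be read as 二百五十万.

The advantage of this algorithm is that it supports both readings: 一零九九 and 一千零九十九.

A sample implementation in Python: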

https://github.com/GalAster/WenyanLanguage/blob/master/packages/wenyan-parser-py/source/hanzi2num.py
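Independently of the linked file, the left-to-right idea can be sketched in a few lines of Python. This is one reading of the description above, not the linked implementation. The function name `hanzi_to_int` and the `colloquial` flag are hypothetical. The flag selects between the spoken reading (二百五 = 250) and the formal one (二百五 = 205) discussed earlier in the thread. As an assumption, a trailing digit is scaled by one tenth of the preceding multiplier, so 二千五 gives 2500, which goes a little beyond a literal "multiply by ten". Nested big units such as 萬億 are not handled.

```python
DIGITS = {ch: i for i, ch in enumerate("零一二三四五六七八九")}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10**4, "萬": 10**4, "亿": 10**8, "億": 10**8}


def hanzi_to_int(text, colloquial=True):
    """Parse a hanzi numeral left to right (hypothetical sketch)."""
    total = 0       # value of the finished 萬/億 groups
    section = 0     # value of the current group
    number = 0      # digits read since the last multiplier word
    ndigits = 0     # how many digits make up `number`
    last_unit = 0   # most recent multiplier word, 0 if none yet

    for ch in text:
        if ch in DIGITS:
            # Positional step: shift the pending digits and add the new one.
            number = number * 10 + DIGITS[ch]
            ndigits += 1
        elif ch in SMALL_UNITS:
            # A bare 十 (as in 十五) counts as 一十.
            section += (number if ndigits else 1) * SMALL_UNITS[ch]
            number, ndigits, last_unit = 0, 0, SMALL_UNITS[ch]
        elif ch in BIG_UNITS:
            # 萬/億 scale everything read in the current group.
            total += (section + number) * BIG_UNITS[ch]
            section, number, ndigits = 0, 0, 0
            last_unit = BIG_UNITS[ch]
        else:
            raise ValueError(f"unexpected character: {ch!r}")

    # End of input: a single trailing digit after 百 or larger is
    # shorthand in the colloquial reading (assumption, see lead-in).
    if colloquial and ndigits == 1 and last_unit >= 100:
        number *= last_unit // 10
    return total + section + number


print(hanzi_to_int("二百五"))                    # colloquial reading
print(hanzi_to_int("二百五", colloquial=False))  # formal reading
print(hanzi_to_int("一零九九"), hanzi_to_int("一千零九十九"))
```

Because the shorthand is applied only at the end of the input, 二百五十万 parses as 2,500,000, while a digit directly before 万 is always taken literally.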