Open brynne8 opened 4 years ago
Since parsing Chinese numerals is an interesting task, I wrote a simple parser as a PEG using LPeg.re.
Link: chinese_number.lua
Thanks for pointing out the issues! The Chinese numerals have always been the hard part.
console.log, but on the online IDE, I hijacked/monkey-patched the console.log to print to a <div>, in which I added the feature of rendering numbers as hanzi. For arrays, technically I can traverse the whole data structure and recursively change everything to hanzi, but it creates some display issues when the output Array is very long. I'll correct for that in the next online IDE update. Thank you!
I would propose a new approach.
How about we implement a print function in the standard library that prints numbers and other values as hanzi? By default, 書之 would call that function. This would output numbers as hanzi without any hijacking in the IDE and would work everywhere. In addition, another syntax may need to be introduced, such as 記之 or something similar, for the raw output of the target language (which is how the current 書之 works).
I am not very good at wenyan so please feel free to make suggestions to the wording.
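To make the proposal above more concrete, here is a minimal sketch in Python rather than in wenyan-lang's actual standard library. The names `int_to_hanzi`, `render`, and `shu` are hypothetical, the range is limited to 0 through 9999 to keep it short, and it writes 12 as 十二 rather than 一十二. It only illustrates the idea of a print function that converts numbers (including numbers nested in arrays) before output.

```python
DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # multiplier word for each decimal position


def int_to_hanzi(n: int) -> str:
    """Render 0 <= n < 10000 as hanzi (hypothetical helper, sketch only)."""
    if not 0 <= n < 10000:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return "零"
    out = []
    pending_zero = False
    text = str(n)
    for i, ch in enumerate(text):
        d = int(ch)
        power = len(text) - 1 - i
        if d == 0:
            # Remember the gap; it is only written if a non-zero digit follows.
            pending_zero = True
            continue
        if pending_zero:
            out.append("零")
            pending_zero = False
        # A leading 一 before 十 is dropped, so 12 becomes 十二, not 一十二.
        if not (d == 1 and power == 1 and not out):
            out.append(DIGITS[d])
        out.append(UNITS[power])
    return "".join(out)


def render(value) -> str:
    """Recursively convert numbers inside lists and tuples to hanzi."""
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, int) and 0 <= value < 10000:
        return int_to_hanzi(value)
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(render(v) for v in value) + "]"
    return str(value)


def shu(*values) -> None:
    """Hypothetical stand-in for what 書之 could call by default."""
    print(" ".join(render(v) for v in values))
```

Calling `shu([12, 101])` would then print hanzi on any target, while a separate raw-output form could keep calling the target language's own print.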
* 一百一: 101 was the original behavior, but it was changed as requested in issue #24. 二百五 = 250 sounds more common, though.
Surely it sounds more common, but only in spoken language, and very colloquial at that. Formally, 二百五 should be 205.
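I have an algorithm here, though I'm not sure whether it has any holes:
Read from left to right. For each digit read, multiply by ten and add the next digit; if the character is a multiplier word, multiply by the corresponding multiplier instead.
Then, on reaching <EOS>, run an extra check: if it is not 十, multiply by ten.
This is because only 二百五 exists; there is no 二百五万, which can only be read as 二百五十万.
The advantage of this algorithm is that it supports both readings, 一零九九 and 一千零九十九.
A sample implementation in Python: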
https://github.com/GalAster/WenyanLanguage/blob/master/packages/wenyan-parser-py/source/hanzi2num.py
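For readers who do not want to follow the link, the following is a rough sketch of one way to read the left-to-right idea described above. It is not the code from the linked repository. The function name `hanzi_to_int` and the `colloquial` flag are made up for illustration, and scaling a trailing digit by one tenth of the last multiplier (rather than always by ten) is my own generalization. With `colloquial=False` the trailing digit is taken literally, which gives the formal reading discussed earlier in this thread.

```python
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
LARGE_UNITS = {"萬": 10**4, "万": 10**4, "億": 10**8, "亿": 10**8}


def hanzi_to_int(text: str, colloquial: bool = True) -> int:
    """Parse a non-negative Chinese numeral (illustrative sketch only)."""
    total = 0                # value of completed 万/亿 sections
    section = 0              # value accumulated in the current section
    run = 0                  # pending plain digits, read positionally
    have_run = False         # whether any digit is pending
    last_unit = 0            # most recent multiplier word, 0 if none yet
    zero_since_unit = False  # whether 零 appeared after that multiplier

    for ch in text:
        if ch in DIGITS:
            # Positional reading: 一零九九 accumulates to 1099.
            run = run * 10 + DIGITS[ch]
            have_run = True
            if DIGITS[ch] == 0 and last_unit:
                zero_since_unit = True
        elif ch in SMALL_UNITS:
            unit = SMALL_UNITS[ch]
            # A bare 十 (as in 十二) counts as 一十.
            section += (run if have_run else 1) * unit
            run, have_run = 0, False
            last_unit, zero_since_unit = unit, False
        elif ch in LARGE_UNITS:
            unit = LARGE_UNITS[ch]
            section += run
            if total < unit:
                total = (total + section) * unit  # e.g. 一万亿
            else:
                total += section * unit           # e.g. 一亿二千万
            section, run, have_run = 0, 0, False
            last_unit, zero_since_unit = unit, False
        else:
            raise ValueError(f"unexpected character: {ch!r}")

    # End of input: optionally scale a single trailing digit (二百五 as 250).
    if have_run:
        if colloquial and last_unit > 10 and not zero_since_unit and run < 10:
            run *= last_unit // 10
        section += run
    return total + section
```

With the default flag, `hanzi_to_int("二百五")` yields 250 and `hanzi_to_int("一百一")` yields 110; passing `colloquial=False` yields 205 and 101 instead. Both 一零九九 and 一千零九十九 come out as 1099 either way.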
十二, which means 12, is output as 一十二, which is not the common form.
一百一 should be parsed as 101 instead of 110, but wenyan-lang seems to do the latter.