Vocab-Apps / pinyin-jyutping

Convert Chinese text to Pinyin or Jyutping
GNU General Public License v3.0

there is 1 redundant space before and after the punctuation #2

Open bk111 opened 1 year ago

bk111 commented 1 year ago
  1. After conversion, there is one redundant space before and after each punctuation mark.
  2. pinyin-jyutping takes twice as long as pinyin-jyutping-sentence.
  3. pinyin-jyutping-sentence drops Chinese punctuation.
  4. Please try this input: items1 = """25、这件事急不得,表面要装镇定,以免打草惊蛇。 21、这次行动千万要保密,不能打草惊蛇。 22、消息指她们都比平日"格外小心",以免打草惊蛇,故媒体也未能得知她们的身份。 """
bk111 commented 1 year ago

please check the picture https://forum.chinese-learning.me/viewtopic.php?f=5&t=417

luc-vocab commented 1 year ago

I understand what you mean now: each punctuation character is preceded by a space that wasn't present in the Chinese text. Let me think of a way to fix that. In your opinion, is it important to preserve the whitespace of the original text?

FYI python-pinyin-jyutping won't be developed anymore, but I can try to improve the speed of pinyin-jyutping. Can you tell me what your expectation is in terms of speed?

bk111 commented 1 year ago

If possible, please keep the whitespace the same as in the original text. The speed is OK, but why is the new version slower than the previous one?

luc-vocab commented 1 year ago

What's your expectation for this input text?

input: 25、这件

I can easily produce the following output (a space after the punctuation, but not before), which matches the convention for Latin-script languages:

output: 25、 zhèjiàn

Another example:

input: 請問,你叫什麼名字?
output: qǐngwèn, nǐ jiào shénme míngzi

Let me know whether this would work for you.
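
For illustration only, the "space after, but not before" convention could be applied as a post-processing step on an already-romanized string. This is a minimal sketch, not how pinyin-jyutping does it internally: `latin_spacing` and `PUNCT` are hypothetical names, and the punctuation set is an assumption that would need extending for real text.

```python
import re

# Assumed punctuation set (ASCII plus common CJK marks); not taken from the library.
PUNCT = ",.!?;:、,。!?;:"
_CLS = re.escape(PUNCT)


def latin_spacing(text: str) -> str:
    """Remove whitespace before punctuation and keep one space after it."""
    # 1. Drop any whitespace that precedes a punctuation mark.
    text = re.sub(r"\s+([" + _CLS + r"])", r"\1", text)
    # 2. Force exactly one space after a mark when ordinary text follows
    #    (not at end of string, and not before another punctuation mark).
    text = re.sub(r"([" + _CLS + r"])\s*(?=[^\s" + _CLS + r"])", r"\1 ", text)
    return text.strip()


if __name__ == "__main__":
    print(latin_spacing("25 、 zhèjiàn"))
    print(latin_spacing("qǐngwèn , nǐ jiào shénme míngzi ?"))
```

One known limitation of this sketch: it treats every `.` as sentence punctuation, so a decimal such as `3.14` would be split.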

bk111 commented 1 year ago

input: 25、这件
output: 25、 zhèjiàn    # this is wrong (there is a redundant space before 'zhè')
output: 25、zhèjiàn     # this is right

input: 請問,你叫什麼名字?
output: qǐngwèn, nǐ jiào shénme míngzi    # this is wrong (there is a redundant space before 'nǐ')
output: qǐngwèn,nǐ jiào shénme míngzi     # this is right
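
Until the library handles this itself, the spacing requested here (no space on either side of punctuation, as in the Chinese source) can be approximated by post-processing the converted string. The following is a rough sketch under that assumption; `tight_spacing` and `PUNCT` are hypothetical names that are not part of pinyin-jyutping, and the punctuation set is only a guess at what matters for this input.

```python
import re

# Assumed punctuation set (ASCII plus common CJK marks); extend as needed.
PUNCT = ",.!?;:、,。!?;:"

# Spaces or tabs on either side of one punctuation mark.
_AROUND_PUNCT = re.compile(r"[ \t]*([" + re.escape(PUNCT) + r"])[ \t]*")


def tight_spacing(text: str) -> str:
    """Remove spaces and tabs directly before and after punctuation marks."""
    return _AROUND_PUNCT.sub(r"\1", text)


if __name__ == "__main__":
    print(tight_spacing("25 、 zhèjiàn"))
    print(tight_spacing("qǐngwèn , nǐ jiào shénme míngzi ?"))
```

Spaces between syllable groups are left untouched, so word segmentation in the romanized output is preserved. Only the whitespace touching a punctuation mark is removed.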