pycantonese Search Results

jacksonllee/pycantonese #50

Undocumented differences between the HKCanCor corpus on Hugg…

The version of HKCanCor published on [HuggingFace](https://huggingface.co/datasets/nanyang-technological-university-singapore/hkcancor/tree/main) by NTU is different from the version offered by this l…

AlienKevin updated 2 months ago

jacksonllee/pycantonese #43

Segmenter removes space of English words in code-mixed sente…

**Describe the bug** The segmenter removes the spaces between English words in code-mixed sentences, for example in this sentence: > 這是Career Centre **To reproduce** Here is the code: ``` import pycantonese fr…

shivanraptor updated 5 months ago

jacksonllee/pycantonese #37

possible to add a custom lookup dict for characters_to_jyutp…

**Describe the bug** I read this and understand that the corpora used for characters_to_jyutping are: (i) the HKCanCor corpus data included in the PyCantonese library, and (ii) the rime-cantonese data …

raymond00000 updated 1 year ago

jacksonllee/pycantonese #42

Does Word Segmentation give position of the vocabularies?

**Feature you are interested in and your specific question(s):** I'm studying the word segmentation of PyCantonese (https://pycantonese.org/word_segmentation.html). Does the function also return the star…

shivanraptor updated 1 year ago

jacksonllee/pycantonese #40

Error parsing hng6

**Describe the bug** A clear and concise description of what the bug is. Error thrown when calling pycantonese.parse_jyutping('hng6') **To reproduce** Steps to reproduce the behavior, including …

Keith-Hon updated 1 year ago

jacksonllee/pycantonese #45

Simplified Chinese characters not supported

I tried to convert characters to Jyutping, but found that only some characters can be converted. For example: txt='昆省急救服务中心嘅医护人员昆省警方。' the output is: [('昆', 'gwan1'), ('省', 'saang2'), ('急救…

zhiqiuiyiye updated 5 months ago

jacksonllee/pycantonese #44

Jyutping to IPA support

**Feature you are interested in and your specific question(s):** Is there any method that converts Jyutping to IPA? I know there's a Jyutping-to-TIPA method now; it would be great to also have jyutping t…

rjrobben updated 3 months ago

jacksonllee/pycantonese #33

Segmenter is too slow

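The current `.segment()` is rather inefficient and does not seem to use an optimal algorithm. @graphemecluster @zhanruiliang A PR may be opened later to look into how to optimize it, and to fix the segmentation problem in https://github.com/jacksonllee/pycantonese/issues/32 along the way.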

laubonghaudoi updated 2 years ago

briankung/cccedict #6

Good stuff

Probably inappropriate to make an issue here, but I am working on a parser with Nom for parsing CHIlDE for Jyutping, good work!

winston0410 updated 2 years ago

jacksonllee/pycantonese #51

New output style request of pycantonese.characters_to_jyutpi…

**Feature you are interested in and your specific question(s):** I would like to request a new output style for pycantonese.characters_to_jyutping, something like this: ``` >>>pycantonese.characters_to_jyutping…

hgneng updated 1 month ago

43 results for pycantonese
