Open LLauryn opened 5 years ago
Thanks. Most of them are used in names. I fixed the bug, so please update the library and check the new results. Some of them are still missing because they are not in cedict. I'll try to find a solution for those in the near future.
@Kyubyong Thanks for your impressive work. I also found that some Chinese characters are not included in the module, such as "琊". Could you update the library to include these missing characters?
thanks
Another question: how many Chinese characters are included in the model? Could you include the full Chinese dictionary? Thanks.
The Chinese grapheme-to-phoneme conversion in this library is incomplete. Here are some of the missing characters I have found: 邓,吴,鄂,皖,蔡,萨,廖,宋,秦,刘,滧,闫,陕,郑,郝,犇,鹏,陇,祾,渭,邹,濮,梵,佟,韩,龚,洛,湘,婍,沂,隋,洣,潘,蒋,禹,喲,闽,湳,綪,睍,孻,汶,杭,吶,黔,渝,辽,銶,滇,灞,溁,浙,渤,邵,赣,淮,郸,彭,傣,蜀,沪,癍,郦,滕,滦,榣,姈,亳,漳,邢,涪,尧,昝,羲,媃,粤,鞑

    from g2pc import G2pC
    g2p = G2pC()
    print(g2p("吴"))

For example, when I input the text "邓小平", the result for "邓" is ('邓', 'nr', '邓', '邓', '', '邓'). When I input "吴", the result is ('吴', 'nr', '吴', '吴', '', '吴'), and so on. In each case the character is returned unchanged instead of its pinyin. All of the characters listed above show the same problem as these examples.
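In case it helps with re-checking after an update, here is a minimal sketch of how the list above could be tested automatically. It assumes, based only on the two example results in this issue, that a failed conversion shows up as a tuple whose third field repeats the input character. `find_unconverted` and `stub_g2p` are hypothetical names made up for this sketch and are not part of g2pC; the stub stands in for the real converter so the snippet runs without the library installed.

```python
from typing import Callable, Iterable, List, Tuple

Token = Tuple[str, ...]


def find_unconverted(
    g2p: Callable[[str], List[Token]], chars: Iterable[str]
) -> List[str]:
    """Return the characters whose pinyin field just repeats the character.

    Assumption (taken from the examples in this issue): each token is a
    tuple whose first field is the word and whose third field is the
    pinyin, and a failed lookup leaves the third field equal to the first.
    """
    missing = []
    for ch in chars:
        for token in g2p(ch):
            word, pinyin = token[0], token[2]
            if pinyin == word:
                missing.append(ch)
                break  # one failing token is enough to flag this character
    return missing


def stub_g2p(text: str) -> List[Token]:
    """Hypothetical stand-in for a G2pC() instance, used only for this demo."""
    known = {"有": "you3"}
    pinyin = known.get(text, text)  # unknown characters come back unchanged
    return [(text, "x", pinyin, pinyin, "", text)]


print(find_unconverted(stub_g2p, ["有", "吴", "邓"]))
```

With the real library, passing a `G2pC()` instance in place of `stub_g2p` should work the same way, provided the tuple layout matches the assumption above.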