-
ICU is not a good choice in China. In addition, customizing the dictionary is very important for Chinese word segmentation, because the application of words in different industries is completely d…
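To illustrate why a custom dictionary matters, here is a minimal sketch using forward maximum matching. This is a simplified stand-in, not the algorithm ICU or jieba actually uses, and the function name `segment`, the `max_len` parameter, and the tiny dictionaries are all hypothetical choices made for this example.

```python
def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical general-purpose dictionary (illustrative only).
base_dict = {"数据", "数据库"}
# Hypothetical industry-specific addition.
custom_dict = base_dict | {"云原生"}

print(segment("云原生数据库", base_dict))
print(segment("云原生数据库", custom_dict))
```

With only the base dictionary, the industry term is split into single characters; once the term is added, it survives as one token. Real segmenters weigh word frequencies as well, so treat this only as an illustration of the dictionary's role.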
-
Due to bundling, fs reads are hard to do on Vercel. My current solution is:
// try {
// const response = await fetch('/dict.txt.big'); // Adjust the path if necessary
// const dictText = await …
-
# Simple: SQLite3 Jieba Tokenizer Plugin :: Wang Fenjin
Using Jieba word segmentation in SQLite3 for more accurate Chinese search
[https://www.wangfenjin.com/posts/simple-jieba-tokenizer/](https://www.wangfenjin.com/posts/simple-jieba-tokenizer/)
-
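Would it be possible to copy the entire dict.txt file that Python's jieba_fast uses for segmentation and part-of-speech tagging into (dict/user.dict.utf8), so that part-of-speech tagging stays consistent with jieba_fast?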
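Before copying one dictionary over another, it may help to check where the two actually disagree. The sketch below assumes each line has the form `word freq tag`, which is the layout of jieba's dict.txt; whether the target user dictionary accepts the same three-column layout is an assumption to verify. The helper names and the sample entries are hypothetical and made up for illustration.

```python
def parse_dict_lines(lines):
    """Parse 'word freq tag' lines into {word: (freq, tag)}; skip blank or malformed lines."""
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        word, freq, tag = parts
        entries[word] = (int(freq), tag)
    return entries


def tag_mismatches(reference, other):
    """Return the words present in both dictionaries whose POS tags differ."""
    shared = reference.keys() & other.keys()
    return sorted(w for w in shared if reference[w][1] != other[w][1])


# Made-up sample entries, not taken from any real dictionary file.
sample_a = parse_dict_lines(["北京 100 ns", "研究 50 vn", ""])
sample_b = parse_dict_lines(["北京 10 ns", "研究 5 v"])

print(tag_mismatches(sample_a, sample_b))
```

In practice the lines would come from the two dictionary files, opened with `encoding="utf-8"`.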
-
jieba-rs supports part-of-speech tagging through its `tag` function. Could you expose it in this library?
-
Hello, thank you very much for creating the Go version of jieba. I have a small question and would like you to check whether it is a bug.
```
// step1: new jieba object
jbt.jieba = gojieba.NewJieba(
"../data/jieba.dict.utf8",
"../data/hmm_model.utf8",
"../data/user.dict.utf8",
…
```
-
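This research is quite interesting!
You could refer to the work from Academia Sinica, which can analyze larger volumes of text with greater accuracy:
`https://github.com/ckiplab/ckip-transformers#models`
I recommend Albert-tiny: the model loads quickly, and everyday text is no problem for it.
If you need a tutorial notebook or have any questions, feel free to ask!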
-
---- Error message
double free or corruption (out)
SIGABRT: abort
PC=0x7f83c2ca9acf m=7 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 5 [syscall]:
runtime.cgocall(0x52cff0, …
-
Hi, I would like to use this package to help with Chinese learning. I would be willing to help with development, but might need some pointers. I would probably use `jieba` for tokenization. Please let…
-
___go_build_womata_service_sys(79285,0x173a4b000) malloc: *** error for object 0x60000a7dc900: pointer being freed was not allocated
___go_build_womata_service_sys(79285,0x173a4b000) malloc: *** set …