Closed: yiouyou closed this issue 3 years ago
Describe the bug
Here is what I've done to modify the "jina-wikipedia-sentences" example so that it works with Chinese instead of English:
- Change the requirements.txt to add paddlehub and paddlepaddle
  jina[devel,torch,hub]==0.9.0
  transformers==3.5.1
  kaggle==1.5.10
  paddlehub==1.7.1
  paddlepaddle==1.8.5
- Change encode.yml to use TextPaddlehubEncoder with the chinese-roberta-wwm-ext-large model
  !TextPaddlehubEncoder
  with:
    model_name: chinese-roberta-wwm-ext-large
- Prepare some Chinese text data
[zh-input_20.txt](https://github.com/jina-ai/jina-hub/files/5810429/zh-input_20.txt)
- 'python app.py index' completes successfully
- However, 'python app.py search' fails with an error when I search for Chinese words in example.html:
Any idea why?
Thanks!
Describe how you solve it
Environment
Screenshots
Hey @yiouyou, can you check whether this also happens with version 0.9.13?
Hey @yiouyou, can you let me know how many documents you are indexing and how many shards you are using?
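For anyone answering the same question, here is a minimal sketch for counting the documents and checking whether some shard must end up empty. It assumes one document per non-empty line of the input file; `count_documents` and `guaranteed_empty_shards` are hypothetical helpers, not part of Jina, and the check is only a lower bound because the actual distribution of documents across shards is not modeled.

```python
def count_documents(lines):
    """Count non-empty lines, assuming one document per line (an assumption)."""
    return sum(1 for line in lines if line.strip())


def guaranteed_empty_shards(num_docs, num_shards):
    """Lower bound on the number of empty shards.

    With fewer documents than shards, at least (num_shards - num_docs)
    shards cannot receive any document, however they are distributed.
    """
    return max(0, num_shards - num_docs)


if __name__ == "__main__":
    sample = ["first document\n", "\n", "second document\n"]
    docs = count_documents(sample)
    print("documents:", docs)
    print("shards that must be empty with 3 shards:", guaranteed_empty_shards(docs, 3))
```

To use it on a real file, pass the open file object to `count_documents`. Even when the lower bound is zero, an uneven distribution can still leave a shard empty.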
The problem seems to go away when I add more documents.
Even with a small number of documents, #1689 should have fixed this. You are welcome to try the latest version.
With jina 0.9.16, the "TypeError('zip argument #2 must support iteration')" goes away. To avoid empty shards, I'm now indexing the lyrics of 400 songs. However, I still have some problems getting Chinese search to work. The updated, detailed steps are listed in https://github.com/jina-ai/examples/issues/350
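For context, this kind of TypeError is what Python raises when `zip` receives something non-iterable, such as `None`, as its second argument. The sketch below only illustrates that mechanism under the assumption that an empty shard handed back `None` instead of a list. `merge_results` and `merge_results_safe` are hypothetical names and are not taken from Jina's code or from the fix in #1689.

```python
def merge_results(doc_ids, scores):
    """Pair document ids with scores (hypothetical, unguarded version)."""
    return list(zip(doc_ids, scores))


def merge_results_safe(doc_ids, scores):
    """Same pairing, but treat a missing result (None) as 'no matches'."""
    if doc_ids is None or scores is None:
        return []
    return list(zip(doc_ids, scores))


if __name__ == "__main__":
    try:
        # Simulates a shard that returned no scores at all.
        merge_results(["doc-1"], None)
    except TypeError as err:
        # The exact message wording depends on the Python version.
        print("unguarded merge failed:", err)

    print("guarded merge:", merge_results_safe(["doc-1"], None))
```

The guard simply returns an empty result for a shard with nothing to contribute, which is one plausible way to make a search robust to empty shards.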