milvus-io / bootcamp

Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
https://milvus.io
Apache License 2.0

How can I modify the code of the latest version of the QA system to make it support Chinese? #548

Closed hygithub94 closed 3 years ago

Bennu-Li commented 3 years ago

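The natural language model used in this project is paraphrase-mpnet-base-v2 from the Sentence-Transformers library. You only need to replace it with a Sentence-Transformers model that supports Chinese, for example distiluse-base-multilingual-cased-v1, and change the VECTOR_DIMENSION parameter in the configuration file config.py to the dimension of the vectors that model produces. The example model generates 512-dimensional vectors. No other code changes are needed.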

For model details, see Sentence-Transformers: https://www.sbert.net/docs/pretrained_models.html#sentence-embedding-models
Model downloads: https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/
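As a rough illustration of the change described above, here is a minimal sketch. It assumes the sentence-transformers package is installed; the helper names `load_model` and `embed` are hypothetical and are not taken from the bootcamp code, which may load and call the model differently. Only the model name and the VECTOR_DIMENSION value come from this thread.

```python
# Sketch only: load_model and embed are illustrative helpers, not bootcamp functions.

# Value that config.py would hold after the change (512 for the model below).
VECTOR_DIMENSION = 512

# Multilingual model suggested in this thread; it replaces paraphrase-mpnet-base-v2.
MODEL_NAME = "distiluse-base-multilingual-cased-v1"


def load_model(name=MODEL_NAME):
    """Load a Sentence-Transformers model (downloaded on first use)."""
    # Imported lazily so the rest of this sketch works without the package.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def embed(model, sentences, expected_dim=VECTOR_DIMENSION):
    """Encode sentences and check that the vector size matches the config."""
    vectors = model.encode(sentences)
    actual_dim = len(vectors[0])
    if actual_dim != expected_dim:
        raise ValueError(
            f"Model produces {actual_dim}-dimensional vectors, "
            f"but VECTOR_DIMENSION is {expected_dim}"
        )
    return vectors


# Example usage (requires network access to download the model):
#   model = load_model()
#   vectors = embed(model, ["今天天气怎么样?", "How is the weather today?"])
```

The dimension check is there because the collection is created with a fixed vector size, so a mismatch between the model and VECTOR_DIMENSION is the most likely thing to go wrong after swapping models. Data embedded with the old model would presumably need to be re-encoded and re-inserted as well.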

shiyu22 commented 3 years ago

No activity, closing.