Closed pw136 closed 4 months ago

Is it possible to support the conversion of Chinese text?

Same problem. I can adjust the prompt template, but what about the tokenizer?

Multilingual support depends largely on the model you use. The tokenizer is used purely to decide how large chunks should be -- it is unrelated to the model that generates the outputs. If you use a model that handles Chinese text well and translate the prompts, it should work fine, in theory. Let me know how it works out! I'm curious about this use case myself.

Closing due to inactivity.
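To illustrate the point about the tokenizer only controlling chunk size: the sketch below is a hypothetical, minimal version of token-budgeted chunking, not this project's actual API. The names (`chunk_text`, `naive_count`, `max_tokens`) and the CJK-aware counting heuristic are my own assumptions; in practice you would swap in a real tokenizer's count function for the language you care about, and the generation model is untouched either way.

```python
def chunk_text(text, count_tokens, max_tokens=200):
    """Split text into chunks of at most max_tokens tokens.

    count_tokens is any callable returning a token count for a string --
    substitute one backed by a tokenizer suited to your language (e.g.
    Chinese). The choice affects chunk boundaries only, not generation.
    """
    chunks, current = [], ""
    # Treat the CJK full stop as a sentence boundary (simplistic on purpose).
    for sentence in text.replace("\u3002", "\u3002\n").splitlines():
        sentence = sentence.strip()
        if not sentence:
            continue
        candidate = current + sentence
        if current and count_tokens(candidate) > max_tokens:
            chunks.append(current)   # current chunk is full; start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def naive_count(text):
    # Rough stand-in for a real tokenizer: one token per CJK character,
    # plus whitespace-separated words for everything else.
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    rest = "".join(ch for ch in text if not ("\u4e00" <= ch <= "\u9fff"))
    return cjk + len(rest.split())
```

With a small budget, each sentence lands in its own chunk; with a larger one, sentences are packed together until the budget is hit.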