[FEAT]: Before LanceDB Output to LLM Text encoding Big5

Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.

MIT License

17.04k stars, 1.82k forks

Description

I uploaded Traditional Chinese files to the built-in LanceDB. When I asked a question, the attached file was found as a possible source for the answer, but the response itself did not contain the answer.

I confirmed that the data is present in LanceDB, for example under \storage\lancedb\security_2.lance\data, which holds the Chinese file content. It may be more correct to convert the data in LanceDB to Big5 before passing it to the LLM.

Regarding this text encoding issue, would it be possible to provide a fix, or an encoding option, that is applied to the retrieved results before they are passed to the LLM?
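To make the request concrete, here is a minimal Python sketch of the kind of check and conversion being asked about. It is not AnythingLLM or LanceDB code: the function names (`fits_in_big5`, `to_big5_bytes`, `decode_with_fallback`) and the sample string are hypothetical, and it assumes the retrieved chunk is available either as a Unicode string or as raw bytes. It only uses the `big5` and `utf-8` codecs from the Python standard library.

```python
def fits_in_big5(text: str) -> bool:
    """Return True if every character in `text` can be represented in Big5."""
    try:
        text.encode("big5")
        return True
    except UnicodeEncodeError:
        return False


def to_big5_bytes(text: str) -> bytes:
    """Encode text as Big5; characters outside Big5 are replaced with '?'."""
    return text.encode("big5", errors="replace")


def decode_with_fallback(raw: bytes, encodings=("utf-8", "big5")) -> str:
    """Try each candidate encoding in order and return the first clean decode."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample standing in for a retrieved Traditional Chinese chunk.
sample = "資訊安全"

utf8_bytes = sample.encode("utf-8")   # 3 bytes per character here
big5_bytes = to_big5_bytes(sample)    # 2 bytes per character here

# The same text has different byte representations in the two encodings,
# so bytes written as one and read as the other come out garbled or fail.
recovered = decode_with_fallback(big5_bytes)
```

A check like `fits_in_big5` could help narrow down whether the stored text is already intact (in which case the problem is probably elsewhere in the pipeline) or whether it was damaged by a wrong decode when the file was ingested.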

[FEAT]: Before LanceDB Output to LLM Text encoding Big5 #1673
