Open kissyy16 opened 3 months ago
Hey @kissyy16! 👋 I'm here to help you out with any bugs, questions, and even becoming a contributor. Let's tackle this issue together!
The issue where the Q&A segment format in Dify version 0.6.12-fix1 shows nothing when switched to Q&A mode, despite the default paragraph preview displaying normally, might be due to the rate limit being reached for the Embedding model API key used. This can prevent the Q&A pairs from being generated and displayed correctly [1].
To resolve this, check if the rate limit has been reached for the Embedding model API key. You can monitor or adjust these rate limits by referring to the official documentation of the embedding model provider you are using. For example, if you are using OpenAI, you can check their rate limits and adjust your usage accordingly by visiting OpenAI's rate limits documentation [1].
Additionally, ensure that your API key is correctly configured in your self-hosted Docker environment. You can verify or update the API key by navigating to Settings > Model Provider > Hugging Face > Model Type and checking the API Token field [2].
I use a local LLM and a local embedding model.
The English-language document can be properly segmented, but when uploading a Chinese content document, it defaults to the "Chinese" option. The dropdown menu offers both Simplified and Traditional Chinese as options. Choosing either Simplified or the default "Chinese" does not allow for segmentation.
Documents using Chinese content can also be segmented when choosing English, but it doesn't work if Chinese is selected. Could this be a bug?
@JohnJyong The preview is empty only when Chinese, Chinese Simplified, or Chinese Traditional is chosen. The other languages work.
@JohnJyong Modifying the request parameter to doc_language="ch" works. You can try it.
When the language is set to Chinese, it seems that your LLM does not perform very well at generating Q&A pairs.
Could you please try other models, such as GPT-4?
Me too. I use qwen:32b as the LLM and bge-large-zh-v1.5 as the embedding model.
About a month ago, it still worked correctly.
But it doesn't work now.
Now I use the qwen2-7b LLM and the bge-large-zh-v1.5 embedding model. In the Q&A option, Chinese Traditional mode works correctly.
Indexing is too slow. How can I improve it?
How do I set the doc_language parameter?
Has the problem been solved? I am also using the "BGE-base-zh-V1.5" embedding model. Segmentation works normally for English but fails for Simplified Chinese.
I also encountered the same problem. Simplified Chinese cannot be used for Q&A segmentation. The embedding model is chevalblanc/acge_text_embedding, and the Dify version is 0.6.14. However, I was able to use it normally a month ago. @CharlesSong @crazywoola
I had the same issue. It is caused by _format_split_text and format_split_text. If the input text contains special characters such as "*" (some LLMs return results in Markdown format), these functions return an empty list, so qa_preview is empty. One more thing: if the output is not in the format Q1:\nA1:\nQ2:\nA2:..., qa_preview is also empty.
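To illustrate the failure mode described above, here is a minimal sketch. It is not the Dify source: the pattern and the helper names parse_qa_pairs and normalize_labels are hypothetical, chosen only to show how a strict "Q1: ... A1: ..." parser can return an empty list when the model wraps the labels in Markdown bold, and how stripping the emphasis around the labels might restore the pairs.

```python
import re

# Assumed strict pattern for "Q1: ... A1: ..." output (illustrative only,
# not copied from Dify). The question must sit on one line, directly
# followed by the "A<n>:" label.
QA_PATTERN = re.compile(r"Q\d+:\s*(.*?)\s*A\d+:\s*([\s\S]*?)(?=Q\d+:|$)")


def parse_qa_pairs(text: str) -> list[dict[str, str]]:
    """Return question/answer pairs, or an empty list if nothing matches."""
    return [
        {"question": q.strip(), "answer": a.strip()}
        for q, a in QA_PATTERN.findall(text)
        if q.strip() and a.strip()
    ]


def normalize_labels(text: str) -> str:
    """Remove Markdown emphasis wrapped around labels, e.g. '**Q1:**' -> 'Q1:'."""
    return re.sub(r"\*+\s*([QA]\d+:)\s*\*+", r"\1", text)


plain = "Q1: What is Dify?\nA1: An LLM app platform.\nQ2: Is it open source?\nA2: Yes.\n"
markdown = (
    "**Q1:** What is Dify?\n**A1:** An LLM app platform.\n"
    "**Q2:** Is it open source?\n**A2:** Yes.\n"
)

print(parse_qa_pairs(plain))                       # two pairs
print(parse_qa_pairs(markdown))                    # empty list: preview would be blank
print(parse_qa_pairs(normalize_labels(markdown)))  # two pairs again
```

If the real functions behave similarly, cleaning the LLM output before parsing (or prompting the model to avoid Markdown) may be enough to get a non-empty preview.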
Self Checks
Dify version
0.6.12-fix1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Default paragraph preview displays normally, but when I select the Q&A segment format and switch to Q&A mode, it shows nothing.
✔️ Expected Behavior
The document is split correctly into Q&A format.
❌ Actual Behavior
No response