Same issue here. The Chinese answers from query_engine are much shorter than the English ones, and usually truncated. This happened in an earlier version of llama_index as well, but was fixed back then (by loading from the saved index file before querying). Not sure why this has re-emerged...
Hi, @pythonmanGo! I'm Dosu, and I'm here to help the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you are experiencing issues with loading and caching the index in the Chinese question answering system. You mentioned concerns about ensuring the completeness of the answers provided by the system, and you are unsure about setting the max_tokens parameter and the split character. Another user, @madguyjoe, has also encountered a similar issue with truncated Chinese answers in a previous version of the system.
At this point, the issue remains unresolved. Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on this issue. If not, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation. We look forward to hearing from you soon.
How can I make sure the answers in a Chinese question answering system are complete? A few questions about the use of llama_index:
1: If I do not load all the md files at once (many md files are fed into the project as documents), or if I do not cache the index locally with index.storage_context.persist("dir"), I may need to reload many md files on every run, which consumes a large number of tokens. This is obviously not a reasonable approach.
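For reference, a minimal sketch of this build-once-and-persist flow, assuming the llama_index 0.x API; the `./data` and `./storage` directory names are placeholders:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build the index once from all md files -- this is the step that spends
# tokens on embedding -- then persist it so later runs can skip rebuilding.
documents = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")   # writes JSON files to disk
```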
2: If I first serialize the md files to a local cache (persisted JSON files), each query only spends the tokens needed to ask the question, which is much more economical in terms of token usage. However, I then need to build the index by loading the cache:
storage_context = StorageContext.from_defaults(persist_dir=datadir)
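A minimal sketch of reloading the persisted index before querying, again assuming the llama_index 0.x API (this is essentially the workaround mentioned in the comment above):

```python
from llama_index import StorageContext, load_index_from_storage

# Rebuild the index from the persisted JSON cache instead of re-reading
# the md files; no embedding tokens are spent here.
storage_context = StorageContext.from_defaults(persist_dir=datadir)
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("你的问题")  # placeholder Chinese question
```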
The problem with the second method is that when the Chinese Q&A data is fed in this way, subsequent users can ask questions but do not get complete answers. Maybe it is because max_tokens is limited in size by default, and loading the index from the local cache does not seem to reset max_tokens. I need to know how to set max_tokens when users ask questions, or maybe the cause is that I did not set the correct split character? How can I fix this so the system runs correctly?
code:

```python
def LoadMixSearchGPT(dirdata):
    ...

def MixSerchGPT(prompt, datadir):
    prompt = prompt.strip()
    ...
```
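The snippet above is truncated in the original post, so the following is not the author's actual implementation, only a minimal sketch of how max_tokens can be set when reloading a persisted index, assuming the llama_index 0.x ServiceContext API; the model name and token limit are placeholders:

```python
from langchain.chat_models import ChatOpenAI
from llama_index import (
    LLMPredictor,
    ServiceContext,
    StorageContext,
    load_index_from_storage,
)

def load_query_engine(datadir):
    # max_tokens caps the length of each completion; raising it (e.g. to
    # 1024) is the usual fix for truncated Chinese answers.
    llm_predictor = LLMPredictor(
        llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0, max_tokens=1024)
    )
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

    # The service_context must be passed again when loading from the cache:
    # it is not persisted along with the index, which may be why the default
    # (shorter) max_tokens silently comes back after a reload.
    storage_context = StorageContext.from_defaults(persist_dir=datadir)
    index = load_index_from_storage(storage_context, service_context=service_context)
    return index.as_query_engine()
```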