Open ifsheldon opened 9 months ago
Hello,
I have almost the same experience with French, using local LLMs (I've tested llama.cpp, Ollama, and vLLM so far, with the same issue).
As long as I speak English with the bot there is no issue and it runs smoothly, but when I switch to French I get a lot of JSON parsing errors.
Unicode might not be the problem; the JSON parsing seems to be.
It could be an LLM server issue as well.
@jmtrappier you can try the latest from source; the published version is quite old. They are probably preparing a major breaking release.
I'm using 0.3.17. It seems there are still some problems with multi-language JSON parsing.
> Enter your message: So if I said 你好 to you, what should you say.
An exception occurred when running agent.step():

```
Traceback (most recent call last):
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\data_types.py", line 425, in to_google_ai_dict
    function_args = json.loads(function_args)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid \escape: line 1 column 55 (char 54)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\main.py", line 408, in run_agent_loop
    new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\main.py", line 377, in process_agent_step
    new_messages, heartbeat_request, function_failed, token_warning, tokens_accumulated = memgpt_agent.step(
                                                                                          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\agent.py", line 818, in step
    raise e
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\agent.py", line 746, in step
    response = self._get_ai_reply(
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\agent.py", line 451, in _get_ai_reply
    raise e
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\agent.py", line 426, in _get_ai_reply
    response = create(
               ^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\llm_api\llm_api_tools.py", line 133, in wrapper
    raise e
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\llm_api\llm_api_tools.py", line 106, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\llm_api\llm_api_tools.py", line 268, in create
    contents=[m.to_google_ai_dict() for m in messages],
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\llm_api\llm_api_tools.py", line 268, in <listcomp>
    contents=[m.to_google_ai_dict() for m in messages],
              ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\idear\miniconda3\envs\dev\Lib\site-packages\memgpt\data_types.py", line 427, in to_google_ai_dict
    raise UserWarning(f"Failed to parse JSON function args: {function_args}")
UserWarning: Failed to parse JSON function args: {"message": "\u4f60\u597d means Hello in Chinese? That\'s so cool! Thank you for teaching me."}
```
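For anyone debugging this, the failure is reproducible with plain `json.loads`: the payload contains `\'` (backslash + apostrophe), which is a legal escape in Python string literals but not in JSON. A minimal sketch (the payload below is a shortened version of the one in the traceback):

```python
import json

# The model's function arguments contained \' (backslash + apostrophe).
# That's a legal escape in Python string literals, but JSON only allows
# \" \\ \/ \b \f \n \r \t and \uXXXX, so json.loads rejects it.
bad = r'{"message": "\u4f60\u597d means Hello in Chinese? That\'s so cool!"}'

try:
    json.loads(bad)
except json.JSONDecodeError as err:
    print(err)  # Invalid \escape: ...

# Stripping the spurious backslash makes the same payload parse cleanly,
# \uXXXX escapes and all:
good = bad.replace("\\'", "'")
print(json.loads(good)["message"])  # 你好 means Hello in Chinese? That's so cool!
```

Note that it is the `\'` escape that fails here, not the `\u4f60\u597d` sequence; `\uXXXX` escapes themselves round-trip fine through `json.loads`.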
Hi @xlbljz, thanks for the bug report! Based on your input, I realized that the new Gemini (and Anthropic) adapters didn't have the correct `ensure_ascii` on the `json.dumps` calls. I just went back and added those in this PR, which should hopefully fix your bug. Please let me know if it's still persisting on the nightly of the new release when we tag it!
Is your feature request related to a problem? Please describe.
Multi-lingual/Unicode/i18n support would be great!
I was trying to use MemGPT in Chinese, but I found that even GPT-4 cannot understand Chinese, which is weird based on my experience.
Describe the solution you'd like
Full multi-lingual/Unicode/i18n support can be a bit complicated, but I think we can implement it step by step (from easy to hard):
`json.dumps()` needs to turn off `ensure_ascii`. This should enable LLMs like GPT-4 that are capable enough to converse in multiple languages. #800
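As a quick illustration of the `ensure_ascii` point (a sketch, not MemGPT's actual serialization code): with the default `ensure_ascii=True`, `json.dumps` escapes every non-ASCII character to `\uXXXX`, which some local model servers and prompt templates handle poorly, while `ensure_ascii=False` keeps the characters verbatim:

```python
import json

payload = {"message": "你好, bonjour!"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
print(json.dumps(payload))
# {"message": "\u4f60\u597d, bonjour!"}

# ensure_ascii=False: characters are emitted verbatim as UTF-8.
print(json.dumps(payload, ensure_ascii=False))
# {"message": "你好, bonjour!"}

# Both forms are valid JSON and round-trip to the same object:
assert json.loads(json.dumps(payload)) == \
       json.loads(json.dumps(payload, ensure_ascii=False))
```

Either output is valid JSON, so a compliant parser accepts both; the escaped form mainly hurts readability for the model and any downstream consumer that doesn't fully decode `\uXXXX` sequences.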