Closed YijiaZHONG closed 1 year ago
My guess is that the error means the model itself didn't encode its output properly in UTF-8. I don't really know about the Chinese capabilities of the models, although I've previously seen mpt-7b-chat output some Chinese (maybe that one is better?).
Now if you want to experiment with it, since this is Python, you can just go and edit the line where the error happens. You can try:
print(response.decode('utf-8', errors='ignore')) # alternatively: errors='replace'
see: https://docs.python.org/3.11/library/stdtypes.html#bytes.decode
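For illustration, here is how the two error handlers behave on a byte string that ends in the middle of a multi-byte character (the bytes here are just for the demo, not actual model output):

```python
# UTF-8 bytes for "你" (b'\xe4\xbd\xa0') followed by the first byte of
# another multi-byte character, i.e. a sequence cut in the middle.
chunk = b'\xe4\xbd\xa0\xe5'

# errors='ignore' silently drops the incomplete trailing bytes.
print(chunk.decode('utf-8', errors='ignore'))   # 你

# errors='replace' substitutes U+FFFD for the undecodable bytes.
print(chunk.decode('utf-8', errors='replace'))  # 你�

# The default (errors='strict') would raise UnicodeDecodeError instead.
```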
Is it to change gptj = GPT4All ("ggml-gpt4all-j-v1.3-groovy") to gptj = GPT4All("mpt-7b-chat", model_type="mpt")?
I haven't used the Python bindings myself, just the GUI, but yes that looks about right. Of course, you'll have to download that model separately.
OK, I see some model names via the list_models() function.
Ah, actually, when looking in my file browser the file name is: ggml-mpt-7b-chat.bin
You can take a look at it based on the official example, with .bin removed from the code: ... GPT4All("ggml-gpt4all-j-v1.3-groovy").list_models()
I tried errors='ignore' and got the same error:

print(response.decode('utf-8'), errors='replace')
      ^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 1: unexpected end of data
Not sure whether the response sends out the data in the right encoding. If I change the code to just print the response, it returns:
你好
b' \xe6'b'\x82'b'\xa8'b'\xe5\xa5'b'\xbd'b'\xe3\x80\x82'b'\xe6\x88'b'\x91'b'\xe4\xb9'b'\x9f'b'\xe6\x98\xaf'b'\xe8\xbf'b'\x99'b'\xe6\xa0'b'\xb7'b'\xe7\x9a\x84'b'\xe4\xba\xba'b'\xe3\x80\x82'
print(response.decode('utf-8'), errors='replace')
is incorrect. Try:
print(response.decode('utf-8', errors='replace'))
I get for the first bytes string you posted:

with errors='replace': 您好。我也是这�
with errors='ignore': 您好。我也是这

So that works for me.
Yes, now it works
Oh, I didn't notice it earlier and thought this was the first bytes string: b' \xe6'b'\x82'b'\xa8'b'\xe5\xa5'b'\xbd'b'\xe3\x80\x82'b'\xe6\x88'b'\x91'b'\xe4\xb9'b'\x9f'b'\xe6\x98\xaf'b'\xe8\xbf'b'\x99'b'\xe6\xa0'
But it's actually many individual ones: b' \xe6'
b'\xa8'
b'\xe5\xa5'
...
I just pasted that into a Python console to see, and Python automatically concatenates them together when doing that.
So what you probably want to do instead of printing the individual responses right away is to collect the full response into one big bytes string and then decode that. That should help with when individual Unicode code points are "cut in the middle".
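A minimal sketch of that idea, using the first few chunks from the byte strings posted above:

```python
# Simulated streamed chunks; note that some multi-byte characters are
# split across chunk boundaries (b'\xe6' + b'\x82' + b'\xa8' is one char).
chunks = [b' \xe6', b'\x82', b'\xa8', b'\xe5\xa5', b'\xbd', b'\xe3\x80\x82']

# Collect everything first, then decode once at the end.
buffer = bytearray()
for chunk in chunks:
    buffer += chunk

print(bytes(buffer).decode('utf-8'))  # prints " 您好。" (note the leading space)
```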
But the results seemed to be different from what I expected
What do you mean? Also, did you just use errors='replace' or errors='ignore'?
The model I use is "ggml-gpt4all-j-v1.3-groovy". After changing print(response.decode('utf-8')) to print(response.decode('utf-8', errors='ignore')), I ask a question about Python and the answer is: Python people. Python "Hello World!".
Try a different model then. It depends a lot on what input a model was trained on. I don't know how much Chinese went into groovy. Also, I'm not a Chinese speaker either, so I can't really tell.
Maybe try mpt-7b-chat. Or maybe even wizardLM-7B.q4_2. That one says it was created by people from Microsoft and the University of Beijing. I haven't tried that one myself yet, though.
OK, thank you. I also tried the model "mpt-7b-chat"; the problem is the same, there are garbled characters.
As I said before: did you just use errors='replace' or errors='ignore'?
And what I meant in https://github.com/nomic-ai/gpt4all/issues/695#issuecomment-1559057008: it just gives back raw bytes in chunks, but not all raw bytes are valid Unicode characters. For example, 是 encoded in UTF-8 is the bytes b'\xe6\x98\xaf'. But if one response is b'\xe6\x98' and the second one is b'\xaf', you won't get the right result when calling decode() on them individually. You first have to put everything back together again, so that you have b'\xe6\x98\xaf'.decode('utf-8')
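If the chunks should be printed as they arrive, another option (not something the bindings do, just a standard-library sketch) is an incremental UTF-8 decoder from codecs, which buffers the trailing bytes of an unfinished sequence until it is complete:

```python
import codecs

# The incremental decoder holds on to incomplete multi-byte sequences
# and only emits a character once all of its bytes have arrived.
decoder = codecs.getincrementaldecoder('utf-8')()

print(repr(decoder.decode(b'\xe6\x98')))  # '' - incomplete, nothing emitted yet
print(repr(decoder.decode(b'\xaf')))      # '是' - sequence completed
```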
Yes, I also saw that while debugging: the bytes are split up. So even if this works in some cases, the problem may come back if the content of the question changes, and a different model may not be compatible either. I haven't thought of a better solution yet.
I'll have to play around with the Python bindings myself (but not right now, I haven't set them up yet).
But basically what you have to do is wait until the response is finished, then concatenate all the bytes, and only then decode and print it. Not sure if there is an easy way to do that. Maybe the API needs a "done with the response" callback, I don't know.
For the time being, I want to get an overview of all the models first, and then choose the right model to debug with.
Actually, having a closer look with the example and the code, I think you can do something like:
Replace the DualStreamProcessor with an io.BytesIO so we just collect bytes without trying to convert to Unicode:
https://github.com/nomic-ai/gpt4all/blob/8e705d730d6240e4519e4a090f459a471443458f/gpt4all-bindings/python/gpt4all/pyllmodel.py#L198
replace with:
# stream_processor = DualStreamProcessor()
import io
stream_processor = io.BytesIO()
Replace the line in _response_callback that would cause an error without errors='...' and just use raw bytes:
# print(response.decode('utf-8', errors='replace'))
sys.stdout.write(response)  # now just writes to the BytesIO buffer
Then instead of returning stream_processor.output here (BytesIO doesn't have that):
https://github.com/nomic-ai/gpt4all/blob/8e705d730d6240e4519e4a090f459a471443458f/gpt4all-bindings/python/gpt4all/pyllmodel.py#L232
do this:
# return stream_processor.output
stream_processor.seek(0)
return stream_processor.read() # read all the bytes from the start
Finally, disable streaming in the example code (or whatever you're using to call the API) and decode the bytes yourself:
messages = [{"role": "user", "content": "Name 3 colors"}]
response = gptj.chat_completion(messages, streaming=False)
print(response.decode('utf-8', errors='replace'))
Haven't tested that yet, but I'll install the Python bindings here and see if it works.
Edit 2023-06-03: This was done on an older version of the project (2023-05-23). Things have changed a bit since then, you might have to adapt some parts now or check out an old version instead.
Alright, I've tested it and this turned into quite a bit of a hack, but at least I made it work in the end. Here is my own example chat client:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from gpt4all import GPT4All

def main():
    # Retrieve model
    gptj = GPT4All("ggml-mpt-7b-chat.bin")
    # Run model on prompt
    messages = [{"role": "user", "content": "from now on, respond only in Chinese.\nHello"}]
    response = gptj.chat_completion(messages, streaming=False)
    print(response['choices'][0]['message']['content'].decode('utf-8', errors='replace'))

if __name__ == '__main__':
    main()
Here are the changes I made in pyllmodel.py:
diff --git a/gpt4all-bindings/python/gpt4all/pyllmodel.py b/gpt4all-bindings/python/gpt4all/pyllmodel.py
index 6117c9f..8319fda 100644
--- a/gpt4all-bindings/python/gpt4all/pyllmodel.py
+++ b/gpt4all-bindings/python/gpt4all/pyllmodel.py
@@ -195,7 +203,9 @@ class LLModel:
         old_stdout = sys.stdout
-        stream_processor = DualStreamProcessor()
+        #stream_processor = DualStreamProcessor()
+        import io
+        stream_processor = io.BytesIO()
         if streaming:
             stream_processor.stream = sys.stdout
@@ -229,7 +239,9 @@ class LLModel:
         # Force new line
         print()
-        return stream_processor.output
+        #return stream_processor.output
+        stream_processor.seek(0)
+        return stream_processor.read()  # read all the bytes from the start
     # Empty prompt callback
     @staticmethod
@@ -239,7 +251,8 @@ class LLModel:
     # Empty response callback method that just prints response to be collected
     @staticmethod
     def _response_callback(token_id, response):
-        print(response.decode('utf-8'))
+        #print(response.decode('utf-8', errors='replace'))
+        sys.stdout.write(response)  # now just writes to the BytesIO buffer
         return True
     # Empty recalculate callback
And I also had to comment out two lines in gpt4all.py:
diff --git a/gpt4all-bindings/python/gpt4all/gpt4all.py b/gpt4all-bindings/python/gpt4all/gpt4all.py
index f24ee22..243f197 100644
--- a/gpt4all-bindings/python/gpt4all/gpt4all.py
+++ b/gpt4all-bindings/python/gpt4all/gpt4all.py
@@ -211,8 +212,8 @@ class GPT4All():
         response = self.model.generate(full_prompt, streaming=streaming, **generate_kwargs)
-        if verbose and not streaming:
-            print(response)
+        #if verbose and not streaming:
+        #    print(response)
         response_dict = {
             "model": self.model.model_name,
That's of course anything but user friendly, but for now it's better than nothing.
If you want to try this yourself but don't know what to do with those things: the first code block goes into example.py, and the two patches are in git diff format. Save them in separate files, for example as changes1.diff and changes2.diff. Then go to the repository's base directory and run git apply changes1.diff changes2.diff. See this StackOverflow question for more information. After that, run python3 example.py.
Maybe this could be done a bit more thoroughly:
Click into "chat_completion" to get to the "gpt4all/gpt4all.py" file,
change the "dict" returned by the "chat_completion" function,
change the two places where "response" is used in "chat_completion",
and then you can call it like this
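The code sample that comment refers to didn't make it into the thread. As a purely hypothetical sketch of such a change (the helper name build_response_dict is mine; the dict shape follows the response_dict snippet in the gpt4all.py diff above, and where exactly the decode goes is my assumption):

```python
# Hypothetical sketch: decode the raw bytes inside chat_completion itself,
# so callers get back a str instead of bytes. 'response' stands for the
# raw bytes returned by self.model.generate(...) in gpt4all.py.
def build_response_dict(model_name: str, response: bytes) -> dict:
    return {
        "model": model_name,
        "choices": [{
            "message": {
                "role": "assistant",
                # Decode once, after all chunks have been collected.
                "content": response.decode('utf-8', errors='replace'),
            }
        }],
    }

result = build_response_dict("mpt-7b-chat", b'\xe6\x82\xa8\xe5\xa5\xbd')
print(result["choices"][0]["message"]["content"])  # 您好
```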
Fixed by #1281
System Info
macOS 13.1 (22C65), Python 3.11
Reproduction
Using the code below:

import gpt4all
gptj = gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy")
messages = [{"role": "user", "content": "你好"}]
gptj.chat_completion(messages)
It returns an error like:

...python3.11/site-packages/gpt4all/pyllmodel.py", line 204, in _response_callback
    print(response.decode('utf-8'))
          ^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 1: unexpected end of data
Expected behavior
Should work properly