cdhigh opened 1 year ago
I found some unicode strings in conversations.json (attachment: conversations.json.zip).
Is it because the file was saved using Unicode encoding, and the decoding failed due to the lack of the original encoding during reading?
I suspect the unicode sequences in the conversations.json file you attached are more likely due to keystrokes you (maybe inadvertently) sent while interacting with the script, rather than non-printable characters being part of the conversation.
I used your json and could play back the chats just fine; I can't reproduce the `jq` error you got in the first post. I also tried starting a new conversation using the characters you mentioned specifically, without any issues. Can you please tell me the steps to reproduce this `jq` error?
Here's the test I did:
Enter "como dancar lindo" to reproduce the issue.
Or can you remove all characters less than 0x1f from the user-entered string?
Sorry, I can't reproduce. I just copy/pasted the sentence you suggested several times and got no errors. Could this have to do with the Kindle's underlying OS? Maybe a localization issue? Have you tried defaulting to a UTF-8 locale? I have my default locale set to en_US.UTF-8, in case that helps.
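For example, a quick way to check the current locale and temporarily switch to a UTF-8 one before launching the script (assuming en_US.UTF-8 is actually available on the device):

```bash
# Show the locale currently in effect and list the locales the system provides
locale
locale -a

# Temporarily force a UTF-8 locale for this shell session only, then run the script
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
```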
> Or can you remove all characters less than 0x1f from the user-entered string?
That should not be necessary, since the script already "stringifies" the input using `jq -Rs` here - that includes handling special characters whose value is less than 0x1f, e.g. 0x0A for newline (LF).
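As a rough illustration of that mechanism (this is not the exact invocation the script uses, and the variable name is made up):

```bash
# Raw user input containing accented characters and a literal newline (0x0A)
user_input=$'como dançar lindo\nsegunda linha'

# -R reads raw text, -s slurps it into a single string, and `.` re-emits it as a
# properly escaped JSON string literal, so the newline comes out as \n and is
# safe to embed in the request payload
jq -Rs . <<<"$user_input"
# "como dançar lindo\nsegunda linha\n"
```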
I tried some tricks found by googling, but still no luck. Is there a way to determine which of the multiple `jq` statements in the code is causing the error?
You may try running the script using `bash -x ai "como dancar lindo"` to see exactly what gets called, which variables get set, and all the output.
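Since the trace can get very long, one option (nothing built into the script, just plain shell redirection) is to capture it to a file and search it afterwards:

```bash
# bash -x writes its trace to stderr, so redirect stderr to a file
bash -x ai "como dancar lindo" 2>/tmp/ai-trace.log

# Then look for the jq invocations and the parse error in the captured trace
grep -n 'jq' /tmp/ai-trace.log
grep -n 'parse error' /tmp/ai-trace.log
```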
Caught this line!
https://github.com/nitefood/ai-bash-gpt/blob/22e77b1a0c8c1aab6a2ea7fdcfa0a018b42e62a2/ai#L735C5-L735C61

```bash
# ai, line 735
response_text=$(jq -r '.content' <<<"$response_message")
```
You may try placing a `hexdump -C <<<"$response_message"` before that line to see exactly which control character `jq` is complaining about, then run the script normally.
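For reference, a raw LF would show up in the dump as a bare `0a` byte in the middle of the string, rather than as the two characters `\` and `n`. An illustrative example with made-up data (not output from the actual response):

```bash
# A fabricated fragment with a literal newline inside the JSON string value
hexdump -C <<<'{"content":"line one
line two"}'
# 00000000  7b 22 63 6f 6e 74 65 6e  74 22 3a 22 6c 69 6e 65  |{"content":"line|
# 00000010  20 6f 6e 65 0a 6c 69 6e  65 20 74 77 6f 22 7d 0a  | one.line two"}.|
# 00000020
```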
That produces a lot of pages - can I hexdump to a file?
I don't know the first thing about bash and shell scripting~~~ I have some experience in the field of C/C++ and Python.
It's complaining at line 17, col 178.
This is another dump, at line 11, col 151.
After more searching and experimentation, I discovered that the error was caused by the JSON restriction that raw line breaks are not allowed inside string values; if a line break is present, it has to be escaped.
Modifying the following line of code:
https://github.com/nitefood/ai-bash-gpt/blob/22e77b1a0c8c1aab6a2ea7fdcfa0a018b42e62a2/ai#L735C5-L735C61

```bash
# ai, line 735
response_text=$(jq -r '.content' <<<"$response_message")
```
to
```bash
response_text=$(jq -Rnr '[inputs] | join("\\n") | fromjson | .content' <<<"$response_message")
```
totally resolved the issue!
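For illustration, a minimal sketch of the failure and of what the replacement one-liner appears to do (the sample response below is made up; the real one comes from the API):

```bash
# Hypothetical response containing a raw, unescaped newline inside a JSON string
response_message=$'{"role":"assistant","content":"line one\nline two"}'

# The original parse fails, because raw control characters (U+0000..U+001F)
# are not allowed inside JSON string values
jq -r '.content' <<<"$response_message"
# jq: parse error: Invalid string: control characters from U+0000 through U+001F
# must be escaped at line ..., column ...

# The replacement reads the input as raw text (-R) line by line (inputs, with -n so
# jq does not consume the first line itself), re-joins the lines with a literal
# backslash-n, and only then parses the repaired text as JSON
jq -Rnr '[inputs] | join("\\n") | fromjson | .content' <<<"$response_message"
# line one
# line two
```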
The code line comes from Stack Overflow. However, I still don't know the exact meaning of the modified line of code~~~
PS: I don't have glow installed on the Kindle; maybe that is why you cannot reproduce the issue.
Is there a way to pre-escape characters when using a non-English language (Portuguese, etc.) that contains non-ASCII characters like ç, ã, õ? Sometimes JSON parsing errors occur that prevent ChatGPT's responses from being displayed, for example:
jq: parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 13, column 93.
Is there any method for pre-escaping?
Edited: this error can also occur if there are special characters in the text to be sent.