Closed by v1k70rk4 1 year ago
Does OpenAI use different encodings to represent special characters with accents?
"Szoboszlai Dominik jelenleg 20 \xe9ves." - Here, the "\xe9" sequence represents the character "é". As a single byte, 0xE9 is how "é" is encoded in Latin-2 (ISO-8859-2), and also in Latin-1 (ISO-8859-1). It is also how the code point U+00E9 is written as an escape, so the sequence alone does not tell which of these produced it.
"Szoboszlai Dominik jelenleg 21 \xc3\xa9ves..." - In this log message, the "\xc3\xa9" sequence also represents the character "é"; these two bytes are its UTF-8 encoding.
Therefore, I still don't understand completely...
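To make the difference between the two log lines concrete, here is a minimal Python sketch. It only shows how the same "é" ends up as "\xe9" or "\xc3\xa9" depending on the encoding or escaping applied. It does not claim to reproduce what the API or the logging code actually does, and the variable names are made up for illustration.

```python
# Illustrative only: the sentence from the log, written with a real "é".
text = "Szoboszlai Dominik jelenleg 20 éves."

utf8_bytes = text.encode("utf-8")         # é becomes the two bytes C3 A9
latin2_bytes = text.encode("iso-8859-2")  # é becomes the single byte E9
latin1_bytes = text.encode("iso-8859-1")  # é becomes the single byte E9 here too

# ascii() escapes non-ASCII code points, so U+00E9 is also printed as \xe9
# even though no Latin-1/Latin-2 encoding was involved.
escaped = ascii(text)

print(utf8_bytes)
print(latin2_bytes)
print(escaped)
```

So a "\xe9" in a log can mean either a Latin-1/Latin-2 byte or an escaped Unicode code point, while "\xc3\xa9" is what the raw UTF-8 bytes look like when printed.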
Hello,
Firstly, thank you so much for your work! However, lately, there's been a disturbing error that I've been trying to figure out, but I can't seem to identify the issue.
The problem is that sometimes the Unicode escape sequences are incorrectly converted to UTF-8, and they appear as seen in the ###bad### example.
The issue comes up completely at random, and I'm unsure of how to handle it.
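One possible way to cope with input that arrives in mixed encodings is sketched below. It assumes the values are available as raw bytes before decoding, which may not be the case here. `decode_log_bytes` and its `fallback` parameter are hypothetical names, not part of any library.

```python
def decode_log_bytes(raw: bytes, fallback: str = "iso-8859-2") -> str:
    """Decode as UTF-8 first; fall back to a single-byte encoding if that fails.

    Assumption: anything that is not valid UTF-8 is in the fallback encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone 0xE9 byte followed by ASCII is not valid UTF-8, so the
        # Latin-2 style input ends up here.
        return raw.decode(fallback)


print(decode_log_bytes(b"Szoboszlai Dominik jelenleg 20 \xe9ves."))
print(decode_log_bytes(b"Szoboszlai Dominik jelenleg 21 \xc3\xa9ves..."))
```

Trying UTF-8 first is a deliberate choice: a strict UTF-8 decode rejects most single-byte-encoded accented text, whereas a single-byte decoder accepts nearly any input and would silently produce garbled characters.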
Thank you in advance for your response :)