zevlg / telega.el

GNU Emacs telegram client (unofficial)
https://zevlg.github.io/telega.el/
GNU General Public License v3.0

telega error=400 triggered for strings that seem to be UTF-8 #457

akater closed this issue 5 months ago

akater commented 6 months ago

Telega Setup

OS: Gentoo Linux
Emacs: GNU Emacs 29.1.50 (build 1, x86_64-pc-linux-gnu, cairo version 1.17.8)
Features: imagemagick svg ffmpeg
Telega: telega v0.8.230 (TDLib v1.8.23-fb27c7c) (telega-server v0.8.2)

Current Behavior

I just tried to send the following message via telega:

"  " " "

(The string has some weird chars in it, enclosed in quotes, but I don't see them in the GitHub preview rendered by my browser.)

I got an error:

user-error: telega error=400: Strings must be encoded in UTF-8

Just to make sure the problematic string was communicated properly via GitHub:

ELISP> (sha1 "\"  \" \" \"")
"64fff0e100dac59d377f7745bddec970751c04fd"

I've set UTF-8 as the default encoding everywhere in my OS and within Emacs. These chars are part of standard Emacs output. In effect, even with UTF-8 set up everywhere, I can't always kill text in an Emacs buffer and send it via telega, which looks like a bug given the error message.

The same error was triggered when I quoted this string in a markdown code block.

Still, it's possible something is wrong on my side.

Steps to Reproduce

Trying to send a message like the one I quoted should be enough to trigger this.

zevlg commented 6 months ago

Could you please eval (mapcar 'identity <your-string>)? GitHub converted your special chars into regular spaces.

akater commented 6 months ago

ELISP> (mapcar #'identity "\"  \" \" \"")
(34 55358 56708 34 32 34 129412 34)

zevlg commented 6 months ago

55358 (U+D83E) and 56708 (U+DD84) are not valid UTF-8 chars. They are a UTF-16 surrogate pair encoding the 🦄 char (U+1F984, the 129412 in your list). Use 🦄 instead of these two chars.
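For text that has already been killed with such pairs in it, one possible workaround is to combine the pairs before sending. The following is only a minimal sketch of that idea: `my/combine-surrogates` is a hypothetical helper name, not part of telega, and it assumes the string contains well-formed high/low pairs (lone surrogates are left untouched and would presumably still trigger the error).

```elisp
(defun my/combine-surrogates (string)
  "Return STRING with each UTF-16 surrogate pair replaced by the char it encodes.
Hypothetical helper for illustration; not part of telega."
  (let ((chars (string-to-list string))
        (result '()))
    (while chars
      (let* ((hi (pop chars))
             (lo (car chars)))
        (if (and lo
                 (<= #xD800 hi #xDBFF)    ; high (leading) surrogate
                 (<= #xDC00 lo #xDFFF))   ; low (trailing) surrogate
            (progn
              (pop chars)                 ; consume the low surrogate too
              ;; Standard UTF-16 decoding: 10 bits from each half,
              ;; offset by #x10000.
              (push (+ #x10000
                       (ash (- hi #xD800) 10)
                       (- lo #xDC00))
                    result))
          ;; Ordinary char or lone surrogate: keep as is.
          (push hi result))))
    (apply #'string (nreverse result))))

;; The string from this issue, rebuilt from its char codes:
(mapcar #'identity
        (my/combine-surrogates
         (string 34 55358 56708 34 32 34 129412 34)))
;; => (34 129412 34 32 34 129412 34)
```

With 55358 and 56708 the arithmetic is #x10000 + (#x3E << 10) + #x184 = #x1F984 = 129412, which matches the 🦄 char already present later in the same string.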