Closed CosmosAtlas closed 1 year ago
Sorry to have taken so long to respond - thank you for the high-quality bug report!
I'm trying to work out how best to handle this. As a starting point, it seems that the GBK encoding can't encode the •
character. In a shell on my Gentoo system, which uses a UTF-8 encoding for the locale:
$ echo '•' | iconv -f UTF-8 -t GBK
iconv: illegal input sequence at position 0
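The same check can be sketched in Python, the language buku is written in. This is only a minimal illustration that assumes Python's built-in `gbk` codec behaves like iconv's GBK here; `can_encode` is a hypothetical helper name, not part of buku or ebuku.

```python
# Minimal sketch: ask Python's codecs whether they can represent U+2022 BULLET.
# `can_encode` is a hypothetical helper, not something buku or ebuku provides.
BULLET = "\u2022"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` without errors."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # gb18030 is included only as a comparison point; it is not used in this thread.
    for enc in ("utf-8", "gbk", "gb18030"):
        print(enc, can_encode(BULLET, enc))
```

If the `gbk` line reports `False`, that would be consistent with the iconv failure above.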
In Emacs, if I:
1. open a buffer for gbk.txt,
2. change its coding system to gbk-dos, setting it with M-x set-buffer-file-coding-system, and
3. M-x insert-char BULLET RET
that results in a *Warning* buffer:
These default coding systems were tried to encode the following
problematic characters in the buffer ‘gbk.txt’:
Coding System    Pos    Codepoint    Char
gbk-dos          1      #x2022       •
However, each of them encountered characters it couldn’t encode:
gbk-dos cannot encode these: •
Does this happen in your Emacs as well?
Next, are you able to add that bookmark, with that title, by using buku
directly from the Windows terminal / command prompt? E.g.:
buku --add https://brennan.io/2015/01/16/write-a-shell-in-c/
If that succeeds, does the •
character correctly appear in the output of a search? E.g.:
buku --sany brennan
Either way, can you please copy-and-paste the search output here?
Thanks for the detailed follow-up! It pointed me in a new direction, and I have solved the issue on my side, albeit from a different angle (details at the end of this post).
Yes, I experience the same when I try to save a file with BULLET
in GBK format.
When I run buku in cmd or pwsh, it works perfectly. Upon inspection via python -c "import sys;print(sys.stdout.encoding)"
it seems that within Emacs (e.g., through shell) it returns "gbk", but in either shell run directly on Windows it returns "utf-8".
At this point I realized that if I could explicitly ask Python to use UTF-8 within Emacs, the issue would be resolved for me. After some searching, I discovered the environment variable PYTHONIOENCODING
. By setting this variable to utf-8 in Emacs (or globally), I was able to make the aforementioned command return "utf-8" within Emacs (which also removes the need to set the cmdproxy encoding).
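For anyone who wants to reproduce the effect outside Emacs, here is a rough sketch that runs the same `python -c` check in a child process with and without `PYTHONIOENCODING`. The helper name `child_stdout_encoding` is made up for illustration, and the result without the variable depends on the platform and on how the child's stdout is connected.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding=None):
    """Run a child Python and return the sys.stdout.encoding it reports.

    `io_encoding` is passed to the child as PYTHONIOENCODING; None means the
    variable is left unset so the child falls back to its default.
    (Hypothetical helper, not part of buku or ebuku.)
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("default :", child_stdout_encoding())
    print("forced  :", child_stdout_encoding("utf-8"))
```

Within Emacs itself, the equivalent is to put the variable into the environment that Emacs passes to its subprocesses, or to set it globally as described above.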
Now ebuku runs perfectly for me using UTF-8 encoding in Emacs on Windows.
I don't know why the shell behaves differently inside Emacs than in the system environment, and there isn't much information available online either. I guess I get a pass this time, but it will probably come back to haunt me in another issue.
Again thanks for the insightful reply!
I tried on Linux and macOS; only Windows has the following issue.
Error I'm getting from debug log.
The bookmark in question is linked.
The related line of ebuku is line 741
Some details about my encoding-related configuration:
cmdproxy is set as (gbk-dos . gbk-dos) (without this setting, ebuku outputs garbage text such as \387\567, etc.)
If I delete the special character ("•", the symbol used between the work and the name), ebuku works fine. ebuku also works fine if I enable the experimental UTF-8 setting on Windows. However, this is not ideal, as it could break other software.
Just off the top of my head: I'm wondering if any of the following is possible.