Open Eli-Zaretskii opened 1 year ago
Eli-Zaretskii @.***> writes:
So my suggestion is to replace utf-8-auto with utf-8. The latter actually does decode the EOL as you'd expect, and is what you usually want.
There is a comment in async.el from Stefan suggesting to use utf-8-unix:
;; FIXME: Why use `utf-8-auto' instead of `utf-8-unix'? This is
;; a communication channel over which we have complete control,
;; so we get to choose exactly which encoding and EOL we use, isn't it?
So what should be used here, utf-8 or utf-8-unix? John?
Thanks, Eli, for looking into this.
-- Thierry
If this is used to communicate between two instances of async.el, then I recommend utf-8-emacs-unix. That is the encoding used by Emacs internally, and it can represent any character that Emacs is capable of processing.
Ok, thanks, so I will use utf-8-emacs-unix as you recommend. @jwiegley let me know what you think about this.
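
A minimal sketch of what that choice means for a string sent over the channel. It assumes nothing about async.el's actual code; the helper names my/channel-encode and my/channel-decode are hypothetical and exist only for illustration.

```elisp
;; Hypothetical helpers, not part of async.el.
(defun my/channel-encode (string)
  "Encode STRING with a fixed encoding and a fixed (Unix) EOL format."
  (encode-coding-string string 'utf-8-emacs-unix))

(defun my/channel-decode (bytes)
  "Decode BYTES that were produced by `my/channel-encode'."
  (decode-coding-string bytes 'utf-8-emacs-unix))

;; The -unix suffix turns off EOL conversion in both directions,
;; so a CR in the payload survives the round trip unchanged.
(let ((text "λ line one\r\nline two\n"))
  (string= text (my/channel-decode (my/channel-encode text))))
```

Because both ends name the same explicit variant, nothing is left to auto-detection, which is the point of the FIXME comment quoted above.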
I agree with @Eli-Zaretskii.
@Eli-Zaretskii Just hijacking this thread, but am I correct in understanding that utf-8-auto now detects EOL as well as BOM?
Going by describe-coding-system on Emacs 29.1:
U -- utf-8-auto
UTF-8 (auto-detect signature (BOM))
Type: utf-8 (UTF-8: Emacs internal multibyte form)
EOL type: Automatic selection from:
[utf-8-auto-unix utf-8-auto-dos utf-8-auto-mac]
I hear you that it will insert BOM on write, so it would be a pretty bad coding system for write. But if you only use it to read e.g. files from both Linux and Mac workstations (some of which somehow have a BOM), but not write anything, it sounds okay.
am I correct in understanding that utf-8-auto now detects EOL as well as BOM?
Yes. It always detected EOL, btw. The fix in Emacs 29 was to correct the handling of BOM.
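
To see the EOL behaviour on decoding, one can compare the explicit variants with the suffix-less ones in an Emacs session. This is only a sketch: the first two results follow from the explicit EOL type, while the last two depend on auto-detection, so no expected values are given for them.

```elisp
;; Input bytes: "a", CR, LF, "b".
(let ((bytes "a\r\nb"))
  (list
   ;; Explicit -unix: no EOL conversion, the CR is kept.
   (decode-coding-string bytes 'utf-8-unix)
   ;; Explicit -dos: CR LF is converted to a single newline.
   (decode-coding-string bytes 'utf-8-dos)
   ;; No suffix: the EOL type is selected automatically from the data.
   (decode-coding-string bytes 'utf-8)
   (decode-coding-string bytes 'utf-8-auto)))
```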
I'm told that async.el uses the utf-8-auto coding-system to encode stuff, assuming that the "auto" part means this coding-system handles the end-of-line (EOL) format automagically.
This is a mistake. Please read the doc string of utf-8-auto, and you will see that the "auto" part is about the BOM, not about the EOL format. Moreover, on encoding utf-8-auto always produces a BOM, something that many Lisp (and non-Lisp) programs don't expect at all. (Due to a bug, utf-8-auto was not producing a BOM on encoding until now, but Emacs 29 fixes that bug.)
So my suggestion is to replace utf-8-auto with utf-8. The latter actually does decode the EOL as you'd expect, and is what you usually want.
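
A quick way to check for the BOM on encoding is sketched below. The helper my/has-utf8-bom-p is hypothetical, and utf-8-with-signature is not mentioned in this issue; it is included only as a coding system that is defined to write a BOM. The utf-8-auto result is version-dependent, as described above, so no expected value is given for it.

```elisp
;; Hypothetical helper for illustration.
(defun my/has-utf8-bom-p (bytes)
  "Return non-nil if unibyte string BYTES starts with the UTF-8 BOM."
  (string-prefix-p "\357\273\277" bytes))

(list
 ;; Plain utf-8 never writes a BOM.
 (my/has-utf8-bom-p (encode-coding-string "a" 'utf-8))
 ;; utf-8-with-signature always writes one.
 (my/has-utf8-bom-p (encode-coding-string "a" 'utf-8-with-signature))
 ;; utf-8-auto: per the report above, a BOM is written from Emacs 29 on.
 (my/has-utf8-bom-p (encode-coding-string "a" 'utf-8-auto)))
```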