tkf / emacs-request

Request.el -- Easy HTTP request for Emacs Lisp
http://tkf.github.com/emacs-request/
GNU General Public License v3.0

Commit 56466cd breaks encoding #158

Closed thanhvg closed 5 years ago

thanhvg commented 5 years ago

OS: Ubuntu 18.04; Emacs 26.3; Request uses curl

The latest commit, 56466cdc18275642547566bb51c257aa2155d07c, breaks encoding in HTML response buffers.

For example, the snippet below reads https://news.ycombinator.com/front?day=2019-11-13 and puts the response into a new buffer named temp:

(defvar temp-buf "temp")

(request "https://news.ycombinator.com/front?day=2019-11-13"
         :parser (lambda ()
                   (goto-char (point-min))
                   (buffer-substring (point-min) (point-max)))
         :success (cl-function (lambda (&key data &allow-other-keys)
                                 (switch-to-buffer-other-window temp-buf)
                                 (with-current-buffer temp-buf
                                   (erase-buffer)
                                   (insert data)))))

But when I try to save this temp buffer to a file, I get this error:


These default coding systems were tried to encode text
in the buffer ‘testme’:
  (utf-8 (8956 . 4194274) (8957 . 4194176) (8958 . 4194195) (11095 . 4194274)
  (11096 . 4194176) (11097 . 4194195) (18682 . 4194274) (18683 . 4194176) (18684
  . 4194201) (26228 . 4194274) (26229 . 4194176))
However, each of them encountered characters it couldn’t encode:
  utf-8 cannot encode these:     \342 \200 \233 ...

Click on a character (or switch to this window by ‘SPC w w’
and select the characters by RET) to jump to the place it appears,
where ‘SPC u C-x =’ will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with SPC w p p and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  raw-text no-conversion

Since the buffer is not encoded properly, parser actions on it such as libxml-parse-html-region will fail.
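
For anyone reading the error above, here is a minimal sketch of what the numbers mean. It assumes only built-in Emacs functions and does not touch request.el itself. Emacs stores an undecoded byte at or above #x80 as a "raw byte" character in the range #x3FFF80..#x3FFFFF, so the character codes in the error can be mapped back to bytes, and those bytes turn out to be ordinary UTF-8 that was never decoded.

```elisp
;; Sketch only: interprets the values shown in the error message above.

;; The character codes 4194274, 4194176, 4194195 are raw-byte characters.
;; Map them back to the bytes they stand for:
(mapcar #'multibyte-char-to-unibyte '(4194274 4194176 4194195))
;; => (226 128 147), i.e. #xE2 #x80 #x93

;; Decoded as UTF-8, those three bytes form one character, U+2013 (EN DASH):
(decode-coding-string (unibyte-string #xE2 #x80 #x93) 'utf-8)

;; The bytes \342 \200 \233 (octal) named in the error are #xE2 #x80 #x9B,
;; which decode to the single character U+201B:
(decode-coding-string (unibyte-string #xE2 #x80 #x9B) 'utf-8)
```

So the page content itself looks like valid UTF-8; the buffer just holds it as raw bytes, which is why utf-8 refuses to encode it on save.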

If I revert this line https://github.com/tkf/emacs-request/blob/56466cdc18275642547566bb51c257aa2155d07c/request.el#L1074 back to

(set-process-coding-system proc encoding encoding)

It works again. I don't know why; I'm just reporting what I observed.

Thanks

dickmao commented 5 years ago

Sorry about this.

I can't seem to reproduce your MWE with ycombinator.com, but I also don't doubt it's happening. The utf-8 conversion has been highly problematic and every time I try to fix it for someone, I break it for someone else.

My change in https://github.com/tkf/emacs-request/commit/30851ddbc0afe591a4c3b968bfa5e6e006168edd#diff-912d9d4e16fd12f55900a2621903077eL1074 was admittedly cavalier and probably too aggressive. I will try to pare it back.

Edit: I decided that change wasn't bold enough. I removed the line altogether in #159.

thanhvg commented 5 years ago

It works again with the latest version. Thanks so much for your quick action!