nnicandro / emacs-zmq

Emacs bindings to ØMQ
GNU General Public License v2.0
49 stars 18 forks source link

JSON parsing error in `zmq--download-module` #12

Closed jackkamm closed 5 years ago

jackkamm commented 5 years ago

When trying to run tests/compilation for emacs-jupyter I started encountering an End of file while parsing JSON error. On further investigation this happens when zmq--download-module function is called.

Here's an example:

jack@jaheira:~/src/emacs-zmq (git)-[master]- % make test
emacs -nw -Q -batch -L . -l ert -l zmq-tests.el \
        --eval "(ert-run-tests-batch-and-exit)"
Checking for compatible module binary to download
Contacting host: api.github.com:443
End of file while parsing JSON
make: *** [Makefile:64: test] Error 255

Tracking the issue further, the error happens because zmq--download-url inserts an empty string into the buffer instead of the expected JSON.

The problem seems to be with url-retrieve-synchronously. In particular, running the following prints the empty string "" on my system:

(setq url-debug t)
(with-temp-buffer
  (url-insert
   (url-retrieve-synchronously
    "https://api.github.com/repos/dzop/emacs-zmq/releases/tags/v0.10.9"))
  (print (buffer-string)))

The *URL-DEBUG* log buffer shows:

http -> Contacting host: api.github.com:443
http -> Marking connection as busy: api.github.com:443 #<process api.github.com>
http -> Request is: 
GET /repos/dzop/emacs-zmq/releases/tags/v0.10.9 HTTP/1.1
MIME-Version: 1.0
Connection: keep-alive
Extension: Security/Digest Security/SSL
Host: api.github.com
Accept-encoding: gzip
Accept: */*
User-Agent: URL/Emacs Emacs/26.2 (X11; x86_64-pc-linux-gnu)
Cookie: logged_in=no

retrieval -> Spinning in url-retrieve-synchronously: nil (#<buffer  *http api.github.com:443*>)
http -> url-http-end-of-document-sentinel in buffer ( *http api.github.com:443*)
http -> Contacting host: api.github.com:443
http -> Marking connection as busy: api.github.com:443 #<process api.github.com>
http -> Request is: 
GET /repos/dzop/emacs-zmq/releases/tags/v0.10.9 HTTP/1.1
MIME-Version: 1.0
Connection: keep-alive
Extension: Security/Digest Security/SSL
Host: api.github.com
Accept-encoding: gzip
Accept: */*
User-Agent: URL/Emacs Emacs/26.2 (X11; x86_64-pc-linux-gnu)
Cookie: logged_in=no

retrieval -> Spinning in url-retrieve-synchronously: nil (#<buffer  *http api.github.com:443*>)
http -> url-http-end-of-document-sentinel in buffer ( *http api.github.com:443*)
http -> Marking connection as free: api.github.com:443 #<process api.github.com>
http -> Activating callback in buffer ( *http api.github.com:443*): #[128 "\302\303\304p#\210\300\305\240\210\301p\240\207" [(nil) (#<buffer  *http api.github.com:443*>) url-debug retrieval "Synchronous fetching done (%S)" t] 5 "

(fn &rest IGNORED)"] (nil)
retrieval -> Synchronous fetching done (#<buffer  *http api.github.com:443*>)

The strange part is that, if I call url-retrieve as follows, I get the expected result:

(url-retrieve
 "https://api.github.com/repos/dzop/emacs-zmq/releases/tags/v0.10.9"
 (lambda (status) (print (buffer-string))))

Then, the even stranger part: after calling url-retrieve once, subsequent calls to url-retrieve-synchronously suddenly start to work correctly! Here is the *URL-DEBUG* log for one of these subsequent calls:

http -> Found existing connection: api.github.com:443 #<process api.github.com>
http -> Reusing existing connection: api.github.com:443
http -> Marking connection as busy: api.github.com:443 #<process api.github.com>
http -> Request is: 
GET /repos/dzop/emacs-zmq/releases/tags/v0.10.9 HTTP/1.1
MIME-Version: 1.0
Connection: keep-alive
Extension: Security/Digest Security/SSL
Host: api.github.com
Accept-encoding: gzip
Accept: */*
User-Agent: URL/Emacs Emacs/26.2 (X11; x86_64-pc-linux-gnu)
Cookie: logged_in=no

retrieval -> Spinning in url-retrieve-synchronously: nil (#<buffer  *http api.github.com:443*-689991>)
http -> Calling after change function `url-http-wait-for-headers-change-function' for `#<process api.github.com>'
http -> url-http-wait-for-headers-change-function ( *http api.github.com:443*-689991)
http -> Saw end of headers... ( *http api.github.com:443*-689991)
http -> url-http-parse-response called in ( *http api.github.com:443*-689991)
http -> Saw chunked encoding.
http -> Calling initial chunked-encoding for extra data at end of headers
http -> Reading chunk 0 (1036 1346 310)
http -> Saw start of chunk 1 (length=1112, start=1037
http -> Reading chunk 1 (1036 1346 310)
http -> Still need 803 bytes to hit end of chunk
http -> Still spinning for next chunk...
http -> Spinning waiting for headers...
http -> Calling after change function `url-http-chunked-encoding-after-change-function' for `#<process api.github.com>'
http -> Reading chunk 1 (1341 2156 815)
http -> Got to the end of chunk #1!
http -> Saw start of chunk 2 (length=0, start=2149
http -> Saw end of stream chunk!
http -> Removing terminator of last chunk
http -> Marking connection as free: api.github.com:443 #<process api.github.com>
http -> url-http-parse-headers called in ( *http api.github.com:443*-689991)
http -> url-http-parse-response called in ( *http api.github.com:443*-689991)
http -> Parsed HTTP headers: class=2 status=200
http -> Finished parsing HTTP headers: t
http -> Marking connection as free: api.github.com:443 #<process api.github.com>
http -> Activating callback in buffer ( *http api.github.com:443*-689991): #[128 "\302\303\304p#\210\300\305\240\210\301p\240\207" [(nil) (#<buffer  *http api.github.com:443*-689991>) url-debug retrieval "Synchronous fetching done (%S)" t] 5 "

(fn &rest IGNORED)"] ((:peer (:certificate (:version 3 :serial-number "04:30:d9:79:1f:fc:b5:20:ac:33:9f:a4:a7:3a:7b:76" :issuer "C=US,O=DigiCert Inc,OU=www.digicert.com,CN=DigiCert SHA2 High Assurance Server CA" :valid-from "2018-06-19" :valid-to "2019-07-10" :subject "C=US,ST=California,L=San Francisco,O=GitHub\\, Inc.,CN=*.github.com" :public-key-algorithm "RSA" :certificate-security-level "Medium" :signature-algorithm "RSA-SHA256" :public-key-id "sha1:4e:0a:00:36:1e:9d:51:52:3b:c2:f2:92:89:4f:6e:8b:56:22:15:e6" :certificate-id "sha1:5f:f1:60:31:09:04:3e:f2:90:d2:b0:8a:50:38:04:e8:37:9f:bc:76") :key-exchange "ECDHE-RSA" :protocol "TLS1.3" :cipher "AES-128-GCM" :mac "AEAD")))
retrieval -> Synchronous fetching done (#<buffer  *http api.github.com:443*-689991>)

I'm using emacs 26.2 on Archlinux.

nnicandro commented 5 years ago

Hmmm...that is weird. Looking at url-http-end-of-document-sentinel, it could be that the server is closing the connection on us. Can you change the url to http and see if that goes through more easily.

Either way, it would be a good idea to handle failures instead of expecting success.

jackkamm commented 5 years ago

Changing the url to http doesn't fix it -- I still get the empty string "".

jackkamm commented 5 years ago

I did a git bisect, and the error (or related ones) start appearing in eb96b39696229424efc2a5ffdaf59c55f1ae931d, when zmq--download-module was introduced. Before that commit, make test runs fine for me.

On a related note -- copying over an older, installed copy of emacs-zmq from my .emacs.d/elpa to the .cask directory allowed me to get tests working for emacs-jupyter.

nnicandro commented 5 years ago

I just did a cask update and the download went through on OS X and it works on Travis most of the time. On Travis there is never an error, but sometimes it just doesn't download anything and builds the module instead. I have seen this error when testing emacs-jupyter on Appveyor but I didn't bother looking into it much and started downloading the module manually when testing on Windows, it may be the same problem.

Are you trying the url-retrieve-synchronously call multiple times and it fails each time until you call url-retrieve? url-retrieve is used internally by url-retrieve-synchronously so I'm thinking it has something to do with how url-retrieve handles bad connections. Can you try downloading something else, e.g. https://ftp.gnu.org/gnu/emacs/emacs-lisp-intro-2.04.tar.gz, and see if you get the same problem.

I think what we can do is check for common programs for downloading files like curl or wget, use those for downloading, and fall back to url-retrieve-synchronously if needed. Then if we have to fallback to url-retrieve-synchronously, resort to building the module if we get a JSON parsing error.

jackkamm commented 5 years ago

Here's a summary of some URLs that do and don't work with url-retrieve-synchronously on my system:

#+begin_src emacs-lisp :results silent
  (setq url-debug t)

  ;; call url-retrieve-synchronously and print a substring of it
  (defun my-retrieve-synchronous (addr)
    (with-temp-buffer
       (url-insert
        (url-retrieve-synchronously addr))
       (buffer-substring-no-properties (point-min) (min (point-max) 100))))
#+end_src

Request to api.github.com returns nothing

#+begin_src emacs-lisp
  (my-retrieve-synchronous
   "https://api.github.com/repos/dzop/emacs-zmq/releases/tags/v0.10.9")
#+end_src

#+RESULTS:

Request to another https address works

#+begin_src emacs-lisp
  (my-retrieve-synchronous
   "https://ftp.gnu.org/gnu/emacs/emacs-lisp-intro-2.04.tar.gz")
#+end_src

#+RESULTS:
: \0{F<<kwj
: ]Qd;NW#'{l,m\0C"MwFM-3y-[+Qh

Request to api.github.com still returns nothing

#+begin_src emacs-lisp
  (my-retrieve-synchronous
   "https://api.github.com/repos/dzop/emacs-zmq/releases/tags/v0.10.9")
#+end_src

#+RESULTS:

Request to HTTP version still returns nothing

#+begin_src emacs-lisp
  (my-retrieve-synchronous
   "http://api.github.com/repos/dzop/emacs-zmq/releases/tags/v0.10.9")
#+end_src

#+RESULTS:

Request to a different URL on api.github.com still returns nothing

#+begin_src emacs-lisp
  (my-retrieve-synchronous
   "https://api.github.com/repos/conda/conda/releases/tags/4.6.14")
#+end_src

#+RESULTS:

Note I was having some trouble consistently reproducing the effect where calling url-retrieve once will allow subsequent url-retrieve-synchronously to work -- sometimes it works, sometimes it doesn't. The requests started giving me an API rate limit exceeded preventing me from further testing to figure out reproducible conditions.

nnicandro commented 5 years ago

One thing I noticed when comparing to your failed download to your successful download is the (:peer (:certificate ... stuff on the http -> Activating callback in buffer line in the successful download output. This is nil for the failed download. Maybe it has something to do with network security? Can you try setting network-security-level to low (https://www.gnu.org/software/emacs/manual/html_node/emacs/Network-Security.html) so we can rule that out.

jackkamm commented 5 years ago

Just tried this and unfortunately didn't work :/

nnicandro commented 5 years ago

This bug report makes me think it still has something to do with network security since it mentions a problem where only the HTTP headers are returned. Does the variable system-configuration-features contain GNUTLS on your system?

jackkamm commented 5 years ago

Yes it does.

nnicandro commented 5 years ago

OK, I figured out why Travis sometimes downloads the module and sometimes doesn't. Its because the API rate limit is hit on Travis sometimes. On Appveyor, its because Emacs on Windows doesn't come with built-in support to handle HTTPS links.

I've just pushed a change that tries to download the module using curl, then wget and falls back to url-retrieve as a last resort.

jackkamm commented 5 years ago

Thanks, it works now :)

jack@jaheira:~/src/emacs-zmq (git)-[master]- % make test
emacs -nw -Q -batch -L . -l ert -l zmq-tests.el \
    --eval "(ert-run-tests-batch-and-exit)"
Checking for compatible module binary to download
Downloading https://github.com/dzop/emacs-zmq/releases/download/v0.10.9/emacs-zmq-x86_64-linux-gnu.tar.gz
Verifying sha256 signature of emacs-zmq-x86_64-linux-gnu.tar.gz
uncompressing emacs-zmq-x86_64-linux-gnu.tar.gz...
uncompressing emacs-zmq-x86_64-linux-gnu.tar.gz...done
Parsing tar file...
Parsing tar file...done
Extracting emacs-zmq-x86_64-linux-gnu/
Extracting emacs-zmq-x86_64-linux-gnu/emacs-zmq.so
ZMQ Version: 4.3.1
Running 11 tests (2019-05-16 15:00:38-0700)
...