tkf / emacs-request

Request.el -- Easy HTTP request for Emacs Lisp
http://tkf.github.com/emacs-request/
GNU General Public License v3.0
629 stars 93 forks source link

Apparent race condition with sync requests #92

Closed alphapapa closed 5 years ago

alphapapa commented 6 years ago

I have found that, somehow, it is possible to put request into a state in which synchronous curl requests return before the curl process has finished, so the request-response object has a data value of nil as soon as it returns.

For example, when request is in this state, a synchronous request returns immediately, and its data is nil:

(request-response-data (request "http://example.com" :sync t))  ;=> nil

But if I store the request object in a variable and then access its data, it works. Notice that the data element of the struct, the third element, is nil, but then when I access it with request-response-data after the setq returns, the data is available:

(setq argh (request "http://example.com" :sync t)) ;=> #s(request-response nil nil nil nil nil "http://example.com/" ...)
(request-response-data argh)                       ;=> data

But when I run another Emacs process and run this same code, it works correctly:

(request-response-data (request "http://example.com" :sync t)) ;=> data

Making this even more confusing, in the "broken" Emacs process, when I run the same code with edebug, everything works correctly. So when request is in this state, everything works correctly when run in edebug, but fails when run normally, which makes this very hard to debug. :(

I have not been able to figure out what happens to put request into this state. It might be connected with request-log-level, but even when setting that back to -1, it does not fix the problem.

Perhaps it's some kind of race condition?

Note: This library appears to be unmaintained now (Hello, @tkf?), so I'm filing this issue for future reference, as this is not the first time this has happened to me and wasted a few hours of time trying to debug it. Maybe someone else will encounter the same problem and be able to figure it out...

alphapapa commented 6 years ago

After much digging I've found that when using the synchronous curl backend, the complete callback (which is used to prevent returning from request--curl-sync until the curl command finishes) is not passed through the chain of calls. It appears to be because the settings field is not set on the response object in the request--curl function. Adding it in that function at this position seems like the natural thing to do:

    (setf (request-response--buffer response) buffer)
    (setf (request-response-settings response) settings)  ; <-
    (process-put proc :request-response response)

This correctly propagates the :success callback.

However, this problem seems to remain unfixed. Adding timestamps to request-log shows that request--curl-sync (which should not return until all the callbacks have completed) returns before the callbacks run, which means that the request has not yet been parsed and set on the response object. Here's an except from the log:

08:52:43:094: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:322: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:327: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:328: [debug] `request--curl-sync' done
08:52:43:328: [debug] `request' RETURNING: RESPONSE DATA: nil  ;; <- The (request ... :sync t) call returning before the callback, which is run on the next line.
08:52:43:352: [debug] REQUEST--CURL-CALLBACK event = finished

08:52:43:352: [debug] REQUEST--CURL-CALLBACK proc = #<process request curl>
08:52:43:352: [debug] REQUEST--CURL-CALLBACK buffer = #<buffer  *request curl*>
08:52:43:352: [debug] REQUEST--CURL-CALLBACK symbol-status = nil
alphapapa commented 6 years ago

Despite further digging I have been unable to figure out the problem. It is very strange, because when I start a new Emacs process, it works correctly at first, but at some point, it stops working correctly, and request--curl-sync returns before the callbacks have run.

I haven't been able to figure out what the catalyst is. Sometimes it seems to start happening after a period of time has passed. Other times it seems to happen immediately, even the first time I make a request.

  (let ((result))
    (mapatoms (lambda (atom)
                (when (string-prefix-p "request-" (symbol-name atom))
                  (push (cond ((ignore-errors (symbol-value atom))
                               (list (symbol-name atom) (s-replace "|" "--" (format "%S" (symbol-value atom)))))
                              (t (list (symbol-name atom) nil)))
                        result))))
    result)

It's the same code running each time, but after some event happens, the callbacks start running at the wrong time, and AFAICT it never works again correctly in that Emacs session.

However, let-binding request-backend to the url-retrieve backend makes the sync requests work correctly. So maybe the workaround is to simply use the url-retrieve backend for synchronous requests (that is, when curl isn't required for some reason).

alphapapa commented 6 years ago

For future reference, here's a patch with the debugging code I added:

diff --git a/emacs.d/elpa/request-20170131.1747/request.el b/emacs.d/elpa/request-20170131.1747/request.el
index 815745f..c6c084c 100644
--- a/emacs.d/elpa/request-20170131.1747/request.el
+++ b/emacs.d/elpa/request-20170131.1747/request.el
@@ -175,7 +175,7 @@ (defmacro request-log (level fmt &rest args)
          (log-level (request--log-level-as-int request-log-level))
          (msg-level (request--log-level-as-int request-message-level)))
      (when (<= level (max log-level msg-level))
-       (let ((msg (format "[%s] %s" ,level
+       (let ((msg (format "%s: [%s] %s" (format-time-string "%H:%M:%S:%3N") ,level
                           (condition-case err
                               (format ,fmt ,@args)
                             (error (format "
@@ -191,7 +191,7 @@ (defmacro request-log (level fmt &rest args)
                (goto-char (point-max))
                (insert msg "\n"))))
          (when (<= level msg-level)
-           (message "REQUEST %s" msg))))))
+           (message "%s: REQUEST %s" (format-time-string "%H:%M:%S:%3N") msg))))))

 
 ;;; HTTP specific utilities
@@ -579,6 +579,8 @@ (defun request-cookie-alist (host &optional localpart secure)
     (setf (request-response--timer response)
           (run-at-time timeout nil
                        #'request-response--timeout-callback response)))
+  ;; This is happening BEFORE the callback is called.
+  (request-log 'debug "`request' RETURNING: RESPONSE DATA: %s" (request-response-data response))
   response)

 (defun request--clean-header (response)
@@ -638,8 +640,7 @@ (defun request--parse-data (response parser)
 (cl-defun request--callback (buffer &key parser success error complete
                                     timeout status-code response
                                     &allow-other-keys)
-  (request-log 'debug "REQUEST--CALLBACK")
-  (request-log 'debug "(buffer-string) =\n%s"
+  (request-log 'debug "`request--callback' (buffer-string) =\n%s"
                (when (buffer-live-p buffer)
                  (with-current-buffer buffer (buffer-string))))

@@ -660,9 +661,9 @@ (defun request--parse-data (response parser)
     ;; Note: Try to do this even `error-thrown' is set.  For example,
     ;; timeout error can occur while downloading response body and
     ;; header is there in that case.
-
+    (request-log 'debug "`request--callback' RESPONSE DATA: %s" (request-response-data response))
     ;; Parse response body
-    (request-log 'debug "error-thrown = %S" error-thrown)
+    (request-log 'debug "`request--callback' error-thrown = %S" error-thrown)
     (condition-case err
         (request--parse-data response parser)
       (error
@@ -673,12 +674,13 @@ (defun request--parse-data (response parser)
          (setq error-thrown err)
          (request-log 'error "Error from parser %S: %S" parser err))))
     (kill-buffer buffer)
-    (request-log 'debug "data = %s" data)
+    (request-log 'debug "`request--callback' data = %s" (pp-to-string data))

     ;; Determine `symbol-status'
     (unless symbol-status
       (setq symbol-status (if error-thrown 'error 'success)))
-    (request-log 'debug "symbol-status = %s" symbol-status)
+    (request-log 'debug "`request--callback' symbol-status = %s" symbol-status)
+

     ;; Call callbacks
     (let ((args (list :data data
@@ -689,18 +691,20 @@ (defun request--parse-data (response parser)
              (cb (if success-p success error))
              (name (if success-p "success" "error")))
         (when cb
-          (request-log 'debug "Executing %s callback." name)
+          (request-log 'debug "`request--callback' Executing %s callback: %s.  DATA: %s" name success (request-response-data response))
           (request--safe-apply cb args)))

       (let ((cb (cdr (assq (request-response-status-code response)
                            status-code))))
         (when cb
-          (request-log 'debug "Executing status-code callback.")
+          (request-log 'debug "`request--callback' Executing status-code callback.")
           (request--safe-apply cb args)))

-      (when complete
-        (request-log 'debug "Executing complete callback.")
-        (request--safe-apply complete args)))
+      (if complete
+          (progn
+            (request-log 'debug "`request--callback' Executing complete callback.")
+            (request--safe-apply complete args))
+        (request-log 'debug "`request--callback' NO COMPLETE CALLBACK.")))

     (setq done-p t)

@@ -1039,7 +1043,8 @@          (default-directory (expand-file-name home-directory))
                     (apply #'request--curl-command url :files* files*
                            :response response settings)))
          (proc (apply #'start-file-process "request curl" buffer command)))
-    (request-log 'debug "Run: %s" (mapconcat 'identity command " "))
+    (request-log 'debug "`request--curl' Run: %s" (mapconcat 'identity command " "))
+    (request-log 'debug "`request--curl' Data: %s" (plist-get settings :data))
     (setf (request-response--buffer response) buffer)
     (process-put proc :request-response response)
     (set-process-coding-system proc 'binary 'binary)
@@ -1168,8 +1173,10 @@ (defun request--curl-callback (proc event)
                   :complete (lambda (&rest _) (setq finished t))
                   settings)
       (let ((proc (get-buffer-process (request-response--buffer response))))
-        (while (and (not finished) (request--process-live-p proc))
-          (accept-process-output proc))))))
+        (while (and (not finished) (process-live-p proc))
+          (request-log 'debug "`request--curl-sync' WAITING FOR CURL...")
+          (accept-process-output proc))
+        (request-log 'debug "`request--curl-sync' done")))))

 (defun request--curl-get-cookies (host localpart secure)
   (request--netscape-get-cookies (request--curl-cookie-jar)
dickmao commented 6 years ago

My reading of the code is that things are as they should be. The :sync t only ensures the curl process finishes before returning control, i.e.,

1171        (while (and (not finished) (process-live-p proc))

The request--curl-callback function only executes when emacs sees a change in process status.

1047    (set-process-sentinel proc #'request--curl-callback)))

That timing seems independent of whatever request.el is doing.

alphapapa commented 6 years ago

@dickmao No, things are not as they should be:

  1. (request url :sync t) behaves differently depending on which backend is used. That's a bug.
  2. The behavior in question usually does not happen the first time a request is made. At some point, something is mutated in the Emacs process, and the behavior changes from that point. That's a bug.
  3. The point of synchronous requests is that, when the request completes, the response data is available. Were this not the case, the calling code would have to spin, waiting for the response data to become available. In that case, there would be no reason to use synchronous requests in the first place. This problem essentially defeats the purpose of the feature; it's a bug.

Your analysis of the code may have found the problem, though. The :sync t code should not return until the response has been processed.

titaniumbones commented 6 years ago

@alphapapa is there a branch or forrk that fixes this problem? I'm using synchronous requests (because i din't understand how to do async code in elisp!) and seem to be encountering this bug. Thank you!

dickmao commented 6 years ago

If it's not a serious elisp project, I'd do something like:

(lexical-let (done-p)
  (with-local-quit  
    (request ... 
             :complete (lambda () setq done-p t)))
  (loop until done-p
        do (sleep-for 0 1000)))

Otherwise, you can follow the example in "Wrapping asynchronous function" in https://github.com/kiwanami/emacs-deferred . You don't need to fully understand what's going on upfront to get it working.

titaniumbones commented 6 years ago

@alphapapa would you be willing to prepare a PR from your patch? I can then try to test and merge. Thank you!

alphapapa commented 6 years ago

@titaniumbones The patch above doesn't fix the problem, it just adds debugging code. @dickmao's analysis may help pinpoint where to fix the problem. I don't expect I'll be able to work on this issue anytime soon. Thanks for taking the initiative.

titaniumbones commented 5 years ago

I'm wondering if a starting point might be to devise a test that will (almost) always catch this issue, but pass if we can figure out a fix?

alphapapa commented 5 years ago

Good luck. I wasn’t able to reliably reproduce it in hours of attempts. Sometimes it would happen immediately, sometimes much later, but once it happened, it happened until I restarted Emacs.

dickmao commented 5 years ago

My reading of the code is that things are as they should be. The :sync t only ensures the curl process finishes before returning control, i.e.,

1171        (while (and (not finished) (process-live-p proc))

Actually I'm wrong (and you're right) -- this should probably be an or instead of an and. I believe the while-loop exits when process-live-p becomes false -- the finished semaphore gets short shrift.

But my earlier comment about the timing of curl-callback being independent of whatever request.el is doing holds. I attempted a fix in #99 .

odanoburu commented 5 years ago

can anyone confirm this is solved in the development branch? I was having a problem where I couldn't signal the error which occurred with a sync request, but now it seems to work.

if this is indeed closed, can we also make a new release?

odanoburu commented 5 years ago

gentle bump