orthecreedence / drakma-async

An asynchronous version of drakma that runs off of cl-async
38 stars 5 forks source link

With catch-app-errors, an error below das:http-request is silently captured #12

Closed orivej closed 11 years ago

orivej commented 11 years ago

I am not sure whether the problem is in das, asf, or my code, but it seems wrong that this finishes without error:

(defun test-error ()
  (error 'division-by-zero))

(defun test-case ()
  (as:with-event-loop (:catch-app-errors t)
    (asf:attach (das:http-request "http://iki.fi/")
                #'test-error))))
orthecreedence commented 11 years ago

So by virtue of the fact that you're using a library which plugs into the futures "platform," you have to be a bit mindful of errors. What's happening here is drakma-async is catching errors that happen in its callback and re-signalling them on the future it returns.

Here's how you'd catch errors when futures are involved:

(defun test-case ()
  (as:with-event-loop (:catch-app-errors t)
    (future-handler-case
      (attach (das:http-request "http://iki.fi/") #'test-error)
      (t (e) (format t "err: ~a~%" e)))))

So future-handler-case acts like handler-case but deals with async/futures a bit better. Keep in mind that it only operates on lexical scope (see the doc link I posted for more details).

In your case, the actual error would be some sort of argument mismatch: the callback given to attach must take the same number of arguments that the future finishes with (or &rest args).

orivej commented 11 years ago

Thank you. But why does event loop not catch the error resignalled from future-handler case? E.g. here the error is not printed, but escapes to the top level:

(defun print-error (e)
  (format t "err: ~a~%" e))

(defun test-case ()
  (as:with-event-loop (:catch-app-errors t :default-event-cb #'print-error)
    (asf:future-handler-case
      (asf:attach (das:http-request "http://iki.fi/") #'test-error))))
orthecreedence commented 11 years ago

Ok, it took me a bit but I figured this out.

What's happening is that http-request finishes its future with a number of values. Those values get passed to test-error which throws a "wrong number of args" error. This propagates up (inside http-request) until the first available event-cb is available. This event-cb is the one attached to the call to http-request-complete-stream in http-request, and it just forwards the event (error) to the future that's returned by http-request by calling signal-error on it.

Meanwhile, future-handler-case rewrote the attach call such that if signal-error is called on the future returned by http-request (which it is) then it will get passed to the future-handler-case form.

future-handler-case, when it catches an error, uses handler-case internally to catch errors by signalling the error received within the body of the handler-case:

(handler-case
  (error caught-error)
  [handler forms...])

Because you don't have any handler forms, the error (rethrown inside the handler-case) passes straight though the future-handler-case and into the REPL. It goes to the REPL instead of print-error because once the error is triggered, the future-handler-case form that catches it is one stack call below the REPL...there are no more handlers to send the error to the default event callback.

So while it all gets confusing, I think this is working as intended (and probably as best as it can be built). :default-event-cb is used in cases where an event-cb isn't passed directly to an async function which requires an event handler. It's not so much a top-level, catch-everything handler as much as a "this function needs event handling but you didn't specify a function to do it with, so use the default-event-cb." I could tie the :default-event-cb to be called when there are no handlers in future-handler-case but that would create a coupling between cl-async and cl-async-future that doesn't sit well with me. I think it's best to catch any errors you need to catch in the handler forms of future-handler-case.

The docs might be confusing here, I probably need to update the section on error handling to make this all a bit clearer. Please let me know if you have any questions or suggestions or anything.

orivej commented 11 years ago

After a little bit of practice, with proper handling of all errors, it feels pleasant to use your libraries. Actually, I'm building a replacement for the good old CL GitHub feed (http://planet.lisp.org/github.atom), which became defunct due to changes in GitHub. Now my worker is catching up on the repositories created since then. (GitHub API provides only for exhaustive scan of all repositories for the info on the language; there is a Web search query which returns the right number of repositories so far, but has its own drawbacks: https://github.com/search?l=Common+Lisp&q=%22%22.)

I noticed two oddities in the libraries. First, as:event-error does not inherit from error. Second, asf lacks its version of as:delay:

(defmacro asf:delay ((&key time) &body body)
  (with-gensyms (future)
    `(let ((,future (asf:make-future)))
       (as:delay (lambda () (asf:finish ,future (progn ,@body))) :time ,time)
       ,future)))
orthecreedence commented 11 years ago

Thanks, I'm glad you like the libs!

So are you scraping that github search page? Scraping and crawling is one place async stuff really shines. Are you replacing the old CL github feed or will there be a new URL posted once you're done? I'm interested to know your progress!

Also, I think the fact that as:event-error doesn't extend error is left over from back in cl-async's early days when there wasn't a concept of an error, just an event. When I changed it to split an event into "info" and "error" I tried to preserve some of the inheritance structures...from what I remember, I may have run into problems with event-error extending both event-info and error. I may need to revisit this. I have some sufficiently complex applications built on cl-async that can tell me in a few minutes whether or not it's a good idea to change this =]. Note that I've added a github issue for this: https://github.com/orthecreedence/cl-async/issues/83.

As far as the delay macro, yeah it's a pretty common pattern. I have done this countless times myself =]. However because it's not providing direct futures functionality (and also because it ties the futures implementation to cl-async too closely since right now they can both be run completely independent from each other) it will most likely be kept out.

orivej commented 11 years ago

Oh, with futures being a subsection in the cl-async documentation, and being named cl-async-futures, I did not notice they were in fact independent from cl-async. The section "Integration with cl-async" explains why futures are not integrated into cl-async, i.e. why cl-async is not implemented in terms of futures; not why they are not integrated with cl-async. However, I see some beauty in this independence.

I subscribed to the issue you opened; I think you meant to title it "Extend event-error class from error".

No, GitHub API is a dedicated service, separate from their HTML frontend. I'm scraping https://api.github.com/repositories and then individual repository pages. This amounts to ~700 000 requests to catch up with GitHub since the last entry of CL GitHub. GitHub allows everyone 5 000 requests per hour, so the whole process takes 6 days, and I'm 4 days in.

Obviously, scraping the search page would be less resource intensive, but I already had an implementation using GitHub API (synchronously) for a corporate GitHub, and I did not remember the way to trick search system intro returning all repositories for a particular language until the previous reply here. And as a benefit, I have data on all the languages. Most of them have traffic lower or around that of Common Lisp, which marks them as targets as suitable to follow.

Whether to replace the old feed will be for Zach to decide, but I am certain to give some options as to what to subscribe for elsewhere too.

orivej commented 11 years ago

Just to let you know, so far I have twice encountered [err] bufferevent.c:611: Assertion bufev_private->refcnt > 0 failed in _bufferevent_decref_and_unlock leading to termination of SBCL, but it runs under a virtualized Ubuntu 12.04, which makes it more likely to be some layers below libevent.

orthecreedence commented 11 years ago

The cl-async/cl-async-future docs need to be split up, but I am too lazy/busy to do it for a bit. Futures used to be a part of cl-async, but I got a few requests to split them up because the futures could conceivably be used in any async system, not just cl-async. Instead of moving the docs, I just left them and liked to them from cl-async-future's README =].

Thanks for the rundown feed, interesting stuff.

orthecreedence commented 11 years ago

Can you open a separate issue for that in cl-async? Also, any details you can give (how many outbound reqs you're doing, 32bit/64bit, SBCL version, etc) would be really helpful.