taoensso / carmine

Redis client + message queue for Clojure
https://www.taoensso.com/carmine
Eclipse Public License 1.0
1.16k stars 132 forks source link

Indecipherable error when starting a message queue handler - broken redis error handling #87

Closed j0ni closed 8 years ago

j0ni commented 10 years ago

Hi

I'm perplexed by this and would appreciate your help deciphering the error below.

2014-Jun-03 17:39:08 +0200 oden INFO [oden.redis] - Starting redis queue worker: {:pool {}, :spec {:host "localhost", :port 6379}}
2014-Jun-03 17:39:08 +0200 oden INFO [taoensso.carmine.message-queue] - Message queue worker starting: oden
2014-Jun-03 17:26:23 +0200 oden ERROR [taoensso.carmine.message-queue] - Worker error! Will backoff & retry.
java.lang.UnsupportedOperationException: nth not supported on this type: ExceptionInfo
                                     clojure.lang.RT.nthFrom                  RT.java:  857
                                         clojure.lang.RT.nth                  RT.java:  807
                      taoensso.carmine.message-queue/handle1        message_queue.clj:  277
taoensso.carmine.message-queue.Worker/start-polling-loop!/fn        message_queue.clj:  358
   taoensso.carmine.message-queue.Worker/start-polling-loop!        message_queue.clj:  342
                 taoensso.carmine.message-queue.Worker/fn/fn        message_queue.clj:  370
                         clojure.core/binding-conveyor-fn/fn                 core.clj: 1910
                                       clojure.lang.AFn.call                 AFn.java:   18
                         java.util.concurrent.FutureTask.run          FutureTask.java:  266
           java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1142
          java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  617
                                        java.lang.Thread.run              Thread.java:  745

I've wrapped the handler setup and the handler function itself in try/catch code, like this:

(defn make-worker [{:keys [server-config thread-count qname worker]} handler]
  (try
    (let [handler (fn [{:keys [message]}]
                    (try
                      (let [message (edn/read-string message)
                            {:keys [metric timestamp rule-key]} message
                            result (handler metric timestamp rule-key)]
                        (if result {:status :success} {:status :error}))
                      (catch Throwable t (timbre/log :error (str "Handler error: " (.getMessage t))))))]
      (timbre/log :info (str "Starting redis queue worker: " server-config))
      (reset! worker (car-mq/worker server-config qname {:handler handler :nthreads thread-count})))
    (catch Throwable t (timbre/log :error (str "Failed to start handler: " (.getMessage t))))))

but I only see the error above. This all works in my local dev environment - this is in an ubuntu 12.04.2 VM. Same JDK version (1.8) and the same uberjar. I've verified that I can connect using redis-cli to localhost without any credentials, on the default port.

Any ideas?

j0ni commented 10 years ago

So a little investigation reveals that the handle1 function on line 277 of message_queue.clj is receiving an exception in place of poll-reply. I added some debugging and tested it locally.

2014-Jun-03 22:23:48 +0000 oden DEBUG [taoensso.carmine.message-queue] - Got poll-reply clojure.lang.ExceptionInfo: ERR unknown command 'EVALSHA' {:prefix :err}

I'm guessing this version of redis (Version: 2:2.2.12-1build1 on ubuntu 12.04.4 in this case) doesn't support that command. I'll see if I can find a more up-to-date version.

j0ni commented 10 years ago

Yeah, upgrading to redis 2.8.9 took care of this.

I guess this issue should now be re-titled "Redis error-handling broken in message queue" or something :)

ptaoussanis commented 10 years ago

Hi J,

Happy you got this sorted out. Thanks for the report (and sorry for the delay replying, have been travelling). Will take a look at improving the error message here, may need a couple weeks till I'm back from vacation.

Cheers! :-)

ptaoussanis commented 10 years ago

Just fixed on dev branch.

gsnewmark commented 8 years ago

@ptaoussanis Hi! So, is exception described in the initial post the intended behavior of the queue worker when Redis returns error string as a result of the Lua script?

Right now poll-reply here would be an ExceptionInfo in case Redis returns error for the command, because wcar doesn't throw errors when multiple commands are passed (related test, multiple command result parsing). For example, if Redis exhausts all allocated memory poll-reply would be an ExceptionInfo with following cause ERR Error running script (call to f_c09905edc7fdcc40f471cef5920114821719103b): -OOM command not allowed when used memory > 'maxmemory'., but handle1 would try to destructure it as vector leading to the initially described exception.

In any case, worker can't really work if memory is exhausted, but thrown exception could probably be a bit more helpful. Moreover, there're probably more cases when Redis returns error for the worker's Lua script, and showing actual Redis exception could really help in debugging the problem.

ptaoussanis commented 8 years ago

@gsnewmark Hi Ivan, thanks for the report. Will try take a look at this w/in the next week.

ptaoussanis commented 8 years ago

@gsnewmark Will be fixed with v2.13.0 - appreciate the detailed report, was helpful. Cheers :-)

gsnewmark commented 8 years ago

@ptaoussanis great, thanks!