taoensso / carmine

Redis client + message queue for Clojure
https://www.taoensso.com/carmine
Eclipse Public License 1.0
1.15k stars 130 forks source link

Exception in message queue #175

Closed uaalto closed 7 years ago

uaalto commented 8 years ago

I cannot find a relation of this with our code. Is it possible that is an internal error of Carmine, or looks more like an issue in the handler?

2016-06-28T06:58:15.367299+00:00 app[worker.1]:                                         java.lang.Thread.run              Thread.java:  745
2016-06-28T06:58:15.367300+00:00 app[worker.1]:           java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  617
2016-06-28T06:58:15.367300+00:00 app[worker.1]:            java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1142
2016-06-28T06:58:15.367301+00:00 app[worker.1]:                          java.util.concurrent.FutureTask.run          FutureTask.java:  266
2016-06-28T06:58:15.367302+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367303+00:00 app[worker.1]:                          clojure.core/binding-conveyor-fn/fn                 core.clj: 1938
2016-06-28T06:58:15.367303+00:00 app[worker.1]:                  taoensso.carmine.message-queue.Worker/fn/fn        message_queue.clj:  276
2016-06-28T06:58:15.367304+00:00 app[worker.1]:    taoensso.carmine.message-queue.Worker/start-polling-loop!        message_queue.clj:  243
2016-06-28T06:58:15.367305+00:00 app[worker.1]: taoensso.carmine.message-queue.Worker/start-polling-loop!/fn        message_queue.clj:  262
2016-06-28T06:58:15.367306+00:00 app[worker.1]:                       taoensso.carmine.message-queue/handle1        message_queue.clj:  213
2016-06-28T06:58:15.367307+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367307+00:00 app[worker.1]:                 taoensso.carmine.message-queue/handle1/error        message_queue.clj:  200
2016-06-28T06:58:15.367308+00:00 app[worker.1]:                                        taoensso.timbre/-log!               timbre.clj:  419
2016-06-28T06:58:15.367326+00:00 app[worker.1]:                                       clojure.core/reduce-kv                 core.clj: 6573
2016-06-28T06:58:15.367327+00:00 app[worker.1]:                                  clojure.core.protocols/fn/G            protocols.clj:  174
2016-06-28T06:58:15.367328+00:00 app[worker.1]:                                              clojure.core/fn                 core.clj: 6547
2016-06-28T06:58:15.367329+00:00 app[worker.1]:                                          clojure.core/reduce                 core.clj: 6545
2016-06-28T06:58:15.367329+00:00 app[worker.1]:                                  clojure.core.protocols/fn/G            protocols.clj:   13
2016-06-28T06:58:15.367330+00:00 app[worker.1]:                                    clojure.core.protocols/fn            protocols.clj:   75
2016-06-28T06:58:15.367331+00:00 app[worker.1]:                           clojure.core.protocols/iter-reduce            protocols.clj:   49
2016-06-28T06:58:15.367332+00:00 app[worker.1]:                                           clojure.core/fn/fn                 core.clj: 6557
2016-06-28T06:58:15.367333+00:00 app[worker.1]:                                     taoensso.timbre/-log!/fn               timbre.clj:  467
2016-06-28T06:58:15.367333+00:00 app[worker.1]:           taoensso.timbre.appenders.core/println-appender/fn                 core.clj:   54
2016-06-28T06:58:15.367334+00:00 app[worker.1]:                            taoensso.timbre/default-output-fn               timbre.clj:   37
2016-06-28T06:58:15.367335+00:00 app[worker.1]:                            taoensso.timbre/default-output-fn               timbre.clj:   46
2016-06-28T06:58:15.367335+00:00 app[worker.1]:                                           clojure.core/force                 core.clj:  746
2016-06-28T06:58:15.367336+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367336+00:00 app[worker.1]:                                  taoensso.timbre/-log!/fn/fn               timbre.clj:  440
2016-06-28T06:58:15.367337+00:00 app[worker.1]:                                 taoensso.timbre/-log!/msg-fn               timbre.clj:  392
2016-06-28T06:58:15.367338+00:00 app[worker.1]:                                           clojure.core/deref                 core.clj: 2228
2016-06-28T06:58:15.367338+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367339+00:00 app[worker.1]:                                     taoensso.timbre/-log!/fn               timbre.clj:  369
2016-06-28T06:58:15.367346+00:00 app[worker.1]:                                           clojure.core/deref                 core.clj: 2228
2016-06-28T06:58:15.367347+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367348+00:00 app[worker.1]:                                     taoensso.timbre/-log!/fn               timbre.clj:  366
2016-06-28T06:58:15.367348+00:00 app[worker.1]:                                           clojure.core/deref                 core.clj: 2228
2016-06-28T06:58:15.367349+00:00 app[worker.1]:                                                          ...
2016-06-28T06:58:15.367350+00:00 app[worker.1]:              taoensso.carmine.message-queue/handle1/error/fn        message_queue.clj:  202
2016-06-28T06:58:15.367351+00:00 app[worker.1]:                                         clojure.core/ex-info                 core.clj: 4617
2016-06-28T06:58:15.367352+00:00 app[worker.1]: clojure.lang.ExceptionInfo: :error handler response
2016-06-28T06:58:15.367352+00:00 app[worker.1]:
ptaoussanis commented 8 years ago

Your queue handler is returning an :error status at some point, which is triggering a logging call, which is malfunctioning for a reason that I can't make out from the info provided here.

Are you using any Timbre middleware?

Could you please try again with [com.taoensso/carmine "2.13.3-uaalto"]?

uaalto commented 8 years ago

Thanks for your reply @ptaoussanis !

Yes, I'm using Timbre too :)

But this error seems to be originated internally, right? The queue handler is indeed returning :error, which I use to signal that the handler didn't succeed but don't want to retry. Is that the correct use of it?

ptaoussanis commented 8 years ago

No problem. Are you using any Timbre middleware? I.e. have you defined any custom middleware functions for Timbre?

Your use looks correct, but we should get better info for debugging with [com.taoensso/carmine "2.13.3-uaalto"].

uaalto commented 8 years ago

Actually, sorry, I confused two separate projects. In this one I'm not using Timbre.

ptaoussanis commented 8 years ago

No problem. To clarify: you don't need to be using Timbre for the suggestion to try [com.taoensso/carmine "2.13.3-uaalto"] for extra debugging info.

uaalto commented 8 years ago

Hi Peter! The issue remains. Here's the latest log:

2016-06-29T11:32:34.013372+00:00 app[worker.1]: 16-06-29 11:32:33 f053c4b1-fbcd-43d5-aee6-cb9128b5603b ERROR [taoensso.carmine.message-queue:202] - Error handling tasks-purchase queue message: c32db1
6a-d985-42c8-8a87-a81318a7a9e3
2016-06-29T11:32:34.013393+00:00 app[worker.1]:                                         java.lang.Thread.run              Thread.java:  745
2016-06-29T11:32:34.013394+00:00 app[worker.1]:           java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  617
2016-06-29T11:32:34.013394+00:00 app[worker.1]:            java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1142
2016-06-29T11:32:34.013395+00:00 app[worker.1]:                          java.util.concurrent.FutureTask.run          FutureTask.java:  266
2016-06-29T11:32:34.013396+00:00 app[worker.1]:                                                          ...                               
2016-06-29T11:32:34.013397+00:00 app[worker.1]:                          clojure.core/binding-conveyor-fn/fn                 core.clj: 1938
2016-06-29T11:32:34.013397+00:00 app[worker.1]:                  taoensso.carmine.message-queue.Worker/fn/fn        message_queue.clj:  287
2016-06-29T11:32:34.013398+00:00 app[worker.1]:    taoensso.carmine.message-queue.Worker/start-polling-loop!        message_queue.clj:  254
2016-06-29T11:32:34.013398+00:00 app[worker.1]: taoensso.carmine.message-queue.Worker/start-polling-loop!/fn        message_queue.clj:  273
2016-06-29T11:32:34.013399+00:00 app[worker.1]:                       taoensso.carmine.message-queue/handle1        message_queue.clj:  224
2016-06-29T11:32:34.013400+00:00 app[worker.1]:                 taoensso.carmine.message-queue/handle1/error        message_queue.clj:  202
2016-06-29T11:32:34.013400+00:00 app[worker.1]:                                        taoensso.timbre/-log!               timbre.clj:  365
2016-06-29T11:32:34.013401+00:00 app[worker.1]:                                           clojure.core/deref                 core.clj: 2228
2016-06-29T11:32:34.013402+00:00 app[worker.1]:                                                          ...                               
2016-06-29T11:32:34.013402+00:00 app[worker.1]:              taoensso.carmine.message-queue/handle1/error/fn        message_queue.clj:  203
2016-06-29T11:32:34.013403+00:00 app[worker.1]:                                         clojure.core/ex-info                 core.clj: 4617
2016-06-29T11:32:34.013404+00:00 app[worker.1]: clojure.lang.ExceptionInfo: :error handler response
2016-06-29T11:32:34.013405+00:00 app[worker.1]:      attempt: 1
2016-06-29T11:32:34.013405+00:00 app[worker.1]:     mcontent: {:idempotency-key "130db567-32f9-41e4-a13d-20a9767b52ab",

REDACTED

}
2016-06-29T11:32:34.013410+00:00 app[worker.1]:          mid: "c32db16a-d985-42c8-8a87-a81318a7a9e3"
2016-06-29T11:32:34.013411+00:00 app[worker.1]:        qname: "tasks"
ptaoussanis commented 8 years ago

Is that the entire stack trace? It's not clear to me what exception's actually being thrown.

uaalto commented 8 years ago

I know, is not clear to me either. This is running in Heroku, could it be tampering or limiting the logs? The redacted part contains just a few keys, but they look completely ok.

uaalto commented 8 years ago

Just added a line I was missing above! It seems the error is here: https://github.com/ptaoussanis/carmine/blob/master/src/taoensso/carmine/message_queue.clj#L202

ptaoussanis commented 8 years ago

I'm not familiar with Heroku or its logging, sorry. Like I say, best I can tell from this info:

  1. Your message-queue handler is returning an :error status for these messages (normal behaviour)
  2. That triggers a Timbre errorf logging call (normal behaviour)
  3. The logging call is failing for some reason (a deref is failing, but it's not clear where or why)

The full stack trace should include info that'd help debug 3, but that doesn't seem to be included here. Not sure if it'd be available elsewhere.

Just have my hands full tonight, but I'll create a debug build for you tomorrow.

uaalto commented 8 years ago

More info: the redacted keys are all strings, except one integer.

ptaoussanis commented 8 years ago

Something you can try in the meantime that could be helpful:

Please just try a Timbre logging call elsewhere:

(require '[taoensso.timbre :as timbre])
(try
  (timbre/errorf "Debug call")
  (catch Throwable t t))

If the logging call completes successfully, this'll return nil. If it fails, it'll return the error.

Please get me:

  1. The error
  2. The version of Timbre that you're using
  3. The value of timbre/*config*

Thanks!

ptaoussanis commented 8 years ago

Any update on this?

uaalto commented 8 years ago

Hi Peter! Sorry, I've been terribly overwhelmed. I actually changed the code to finish the task as {:status :success}. It is of course kind of cheating, but it actually makes more sense in my use case.

I can't run test code in the server anymore because we've moved to production, so I had to to do my tests locally.

  1. That call returns nil.
  2. Timbre version: [com.taoensso/timbre "4.5.0"]
  3. Value of timbre/config:
{:level :debug, :ns-whitelist [], :ns-blacklist [], :middleware [], :timestamp-opts {:pattern "yy-MM-dd HH:mm:ss", :locale :jvm-default, :timezone :utc}, :output-fn #function[taoensso.timbre/default-output-fn], :appenders {:println {:enabled? true, :async? false, :min-level nil, :rate-limit nil, :output-fn :inherit, :fn #function[taoensso.timbre.appenders.core/println-appender/fn--15329]}}}
ptaoussanis commented 7 years ago

Hi Ulysses, sorry for the delay replying.

Think the best way of proceeding on this if you're still having problems (?) would be trying to produce some kind of reproducible example that I could look at and debug from my end.

ptaoussanis commented 7 years ago

Hi Ulysses, can this issue be closed?