taoensso / carmine

Redis client + message queue for Clojure
https://www.taoensso.com/carmine
Eclipse Public License 1.0

`:nthreads 0` errors out in versions >=3.3.0 #315

Open adam-jijo-swym opened 2 days ago

adam-jijo-swym commented 2 days ago

Up to and including version 3.2.0, nthreads was allowed to be 0, but from 3.3.0 the same call throws the following error:

Execution error (ExceptionInfo) at taoensso.truss.impl/default-error-fn (impl.cljc:101).
Invariant failed at taoensso.encore[4909,10]: (pos-int? n-threads)

Example of valid 3.2.0 code that errors out now:

(mq/worker {:pool {}, :spec {:host "localhost", :port 6379, :password "wtv"}}
           "test-queue"
           {:nthreads 0})

Setting nthreads to 0 is useful as it allows disabling workers via config and therefore should be supported in the newer versions as well.

ptaoussanis commented 2 days ago

@adam-jijo-swym Hi Adam,

Setting nthreads to 0 is useful as it allows disabling workers via config and therefore should be supported in the newer versions as well.

Could you please describe this use case a little more? What use is a worker that has no threads? The thread count can't be updated later, so a zero-thread worker will never do anything.

If you just want to delay the worker starting until you actually want it running, you can either postpone calling worker (e.g. by wrapping the call in a delay), or use {:autostart false} in the worker opts (not currently documented, but a stable feature).
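
To sketch both options roughly (the handler fn and connection opts below are just placeholders):

(require '[taoensso.carmine.message-queue :as mq])

(def conn-opts {:pool {} :spec {:host "localhost" :port 6379}})

;; Option 1: postpone creating the worker until it should actually run
(def my-worker
  (delay
    (mq/worker conn-opts "test-queue"
      {:handler  (fn [_] {:status :success}) ; placeholder handler
       :nthreads 1})))

;; Deref the delay when you want the worker created (and started):
;; @my-worker

;; Option 2: create the worker up front, but don't start it yet
(def idle-worker
  (mq/worker conn-opts "test-queue"
    {:handler   (fn [_] {:status :success}) ; placeholder handler
     :nthreads  1
     :autostart false}))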

adam-jijo-swym commented 2 days ago
{:worker-a {:nthreads 10}
 :worker-b {:nthreads 10}}

Our services pick up nthreads values from a config file like the one shown above. Zeroing out the nthreads value allows us to disable message processing for certain queues without any code changes.
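
To illustrate, the consuming side is roughly along these lines (start-workers! and the inline handler here are made-up placeholders, not our actual code):

;; assuming (require '[taoensso.carmine.message-queue :as mq]) as in the example above
(defn start-workers! [conn-opts worker-config]
  (into {}
        (for [[worker-id {:keys [nthreads]}] worker-config]
          [worker-id
           (mq/worker conn-opts (name worker-id)
                      {:handler  (fn [_] {:status :success}) ; placeholder handler
                       :nthreads nthreads})])))

;; With :nthreads 0 allowed, setting e.g. {:worker-b {:nthreads 0}} in the config
;; disables :worker-b's processing on this instance without touching this code.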

Another use case we're looking into is horizontally scaling queues when necessary. Suppose messages in the queue handled by :worker-a are piling up; a new instance could be spun up with the nthreads value for :worker-b zeroed out to support the load.

I agree that the same use case can probably be satisfied by disabling :autostart, but it does make for better ergonomics if we're able to disable workers by zeroing out the nthreads value.

Unless I'm missing something, I don't see a reason why workers without threads couldn't be supported, especially since this was supported previously.

ptaoussanis commented 2 days ago

Our services pick up nthreads values from a config file like the one shown above. Zeroing out the nthreads value allows us to disable message processing for certain queues without any code changes.

Would it work to include {:autostart false} in your config?
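
I.e. something roughly like this in the config (just a sketch of the shape):

{:worker-a {:nthreads 10}
 :worker-b {:nthreads 10, :autostart false}}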

Suppose messages in the queue handled by :worker-a are piling up; a new instance could be spun up with the nthreads value for :worker-b zeroed out to support the load.

I'm not sure I follow this, sorry. So :worker-a is backing up, and starting a zero-thread :worker-b will help to support the load? I might be missing something, but that wouldn't help?

but it does make for better ergonomics if we're able to disable workers by zeroing out the nthreads value.

I'm not necessarily closed to the idea, but I would like there to be a clear case for why this'd be better ergonomically. I worry that the semantics for this just don't make sense.

What does it mean to "start" or "stop" something with zero threads? In contrast, it's logically sensible to think of something with n>=1 threads that hasn't yet been started (and that may be started and stopped later).

Unless I'm missing something, a worker with zero threads is just permanently useless since the thread count can't ever be adjusted after.

especially since this was supported previously.

The key question for me is whether allowing zero threads makes sense. If it doesn't, then one could make the case that the previous support was misleading and prone to cause confusion or misuse.

adam-jijo-swym commented 2 days ago

Would it work to include {:autostart false} in your config?

Yeah it would work. I'm trying to avoid that since it would be another parameter to deal with.

I'm not sure I follow this, sorry. So :worker-a is backing up, and starting a zero-thread :worker-b will help to support the load? I might be missing something, but that wouldn't help?

Ah sorry, reading it back it's clear I didn't explain it well, so let me try again. Assume we have one machine running our service with the config I've shown above. The queue handled by :worker-a suddenly faces a flood of incoming messages and grows rapidly. In that situation we might want to give :worker-a extra capacity. One way to do this would be to start another machine running the same service with the same config, which would help share the load. But since our goal with this second machine is only to support :worker-a, we don't really need :worker-b running on it; having :worker-b running would take away compute better used by :worker-a. Hence it would be better to zero out nthreads for :worker-b on the second machine.

The reason I'm bringing this up is that we're planning on moving to k8s and having this functionality would better enable us to spin up new pods to just support queues under load. I can imagine the newly spun up pods having nthreads values as some function of the rate of growth of their respective queues and in that case it does make sense to have 0 threads when the queue isn't growing. It would be harder to support this "dynamically spawned workers"-type scenario using :autostart false.
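
Purely to illustrate that last idea (the growth-rate input and the numbers here are made up):

(defn nthreads-for-queue
  "Map an observed queue growth rate (msgs/sec) to a worker thread count.
  Returns 0 when the queue isn't growing, i.e. this pod shouldn't work it."
  [msgs-per-sec]
  (-> msgs-per-sec (quot 10) (min 16)))

(nthreads-for-queue 0)   ;=> 0
(nthreads-for-queue 35)  ;=> 3
(nthreads-for-queue 500) ;=> 16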

What does it mean to "start" or "stop" something with zero threads?

Hmm, yeah, I agree that the semantic meaning of a zero-thread worker is a bit shaky. But from a library user's perspective (or at least mine), the nthreads value is a stand-in for "how much compute should we dedicate to processing this queue?". Looking at it that way, a zero-thread worker does make sense.

ptaoussanis commented 2 days ago

The reason I'm bringing this up is that we're planning on moving to k8s and having this functionality would better enable us to spin up new pods to just support queues under load. I can imagine the newly spun up pods having nthreads values as some function of the rate of growth of their respective queues [...]

I appreciate the extra info on your use case, thanks.

And to clarify: I assume you'd want start and stop calls against zero-thread workers to just no-op (rather than throwing)?

Would be happy to see a PR that makes this change and restores support for zero-thread workers 👍 Otherwise I'll do it myself next time I'm on batched Carmine work.