Open · mmmries opened 2 years ago
> This is an use-case that I do not want to support in PullConsumer as it is meant to be a super simple API serving say 60-70% use-cases. See #8. Your use case basically would much more benefit from batching messages, so you could then pass these to `async_stream` or whatever else suits you. For this, I planned to recommend Broadway and to provide Broadway Producer and Consumer modules in this library. What do you think?
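The batching-plus-`Task.async_stream` approach suggested above can be sketched as follows. This is a hypothetical illustration: `BatchExample` and `handle_message/1` are made-up names, and the list of strings stands in for a batch of messages pulled from JetStream.

```elixir
defmodule BatchExample do
  # Placeholder handler; a real one would decode and act on the message body.
  def handle_message(msg), do: String.upcase(msg)

  # Fan a fetched batch out over a bounded pool of processes,
  # then collect the results in order.
  def process_batch(messages) do
    messages
    |> Task.async_stream(&handle_message/1, max_concurrency: 8, timeout: 5_000)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

BatchExample.process_batch(["a", "b", "c"])
# => ["A", "B", "C"]
```

`Task.async_stream/3` preserves input order while bounding concurrency, which is exactly the property you want when processing a pulled batch.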
Picking up the thread of conversation from https://github.com/mmmries/jetstream/pull/43#issuecomment-1085724070
I did a handful of benchmarks to check the performance and overhead of starting multiple `PullConsumer` processes vs starting a single `PullConsumer` that delegates the work to separate processes.
The results are summarized in this graph.
The blue line shows the messages received and acknowledged per second when starting multiple `PullConsumer` processes, and the orange line shows the messages per second when running a slightly modified version with a single `PullConsumer` that starts each job in a separate process.
There's not a huge gap in performance between the two methods, so starting multiple `PullConsumer`s is certainly a viable option.
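For reference, "starting multiple `PullConsumer`s" here just means running several copies of the same consumer module under one supervisor with unique child ids. A sketch, assuming a hypothetical `MyApp.Consumer` module built with `Jetstream.PullConsumer` (the exact child-spec shape may differ by library version):

```elixir
# Sketch: four parallel copies of one pull consumer, each with a unique child id.
# MyApp.Consumer is an assumed module name, not part of the library.
children =
  for i <- 1..4 do
    Supervisor.child_spec({MyApp.Consumer, []}, id: {MyApp.Consumer, i})
  end

Supervisor.start_link(children, strategy: :one_for_one)
```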
> this is an use-case that I do not want to support in PullConsumer as it is meant to be a super simple API serving say 60-70% use-cases
@mkaput I think this would be a good topic for us to discuss on a call, but if you have any thoughts before we get our schedules lined up, I would be happy to read and try to understand them asynchronously.
In my working experience, I have mostly worked at companies that were setting up event architectures between multiple backend services. In every case we had multiple copies of the app running in production (for redundancy/resiliency). So this would mean having a single `PullConsumer` running per BEAM instance, and we would still need to deal with the fact that each instance would get only part of the message history.
And in each of my professional projects, we have wanted to avoid having a queue fall behind just because of a single slow message (maybe something waiting on an IO call), so we have always allowed some number (like 20 - 200) of parallel messages to be processed per instance of the application. This helps to keep each service up-to-date with the stream.
So for me, parallelism is the common case, and I would think of running only a single `PullConsumer` as niche. It sounds like you have the opposite experience? I would love to better understand that use-case.
@mkaput I've continued thinking about the issue of tracking state in the processes receiving messages vs tracking it elsewhere. There are a few other minor reasons to prefer separate processes (e.g. the ability to `nack` a single message, or even letting a process crash without blocking other things from continuing on). But I think all of those issues are relatively minor. The big issue is whether we are primarily writing a library that allows individual processes to receive small amounts of messages and track state in memory very efficiently (i.e. the GenServer model), or one that enables deployments where you have multiple copies of the app running and need to potentially keep up with a large volume of messages without blocking the stream.
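The crash-isolation point can be illustrated with plain `Task.Supervisor` (a hypothetical sketch; the handler and messages are made up): each message runs in its own unlinked task, so one crash or slow IO call doesn't take down or block the others.

```elixir
{:ok, sup} = Task.Supervisor.start_link()

# Placeholder handler that blows up on one particular message.
handle = fn
  "boom" -> raise "bad message"
  msg -> String.upcase(msg)
end

results =
  ["a", "boom", "c"]
  |> Enum.map(fn msg -> Task.Supervisor.async_nolink(sup, fn -> handle.(msg) end) end)
  |> Enum.map(&Task.yield(&1, 1_000))

# results: [{:ok, "A"}, {:exit, _reason}, {:ok, "C"}]
# The caller survives the crash because the task is not linked to it.
```

With a single `GenServer` handling messages inline, the "boom" message would instead crash or stall the whole consumer.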
@byu I would also love to get your take on which of these use-cases best fits your problem space.
Hi @mmmries, I've run some more benchmarks based on yours, but averaging 10 runs per batch size, so it's less likely that we hit some weird spikes. I also added https://github.com/membraneframework/beamchmark, so it's possible to see what is holding us back, and I ran everything on two different machines. I got these results with a small message size:
Looking at the Beamchmark data, we've come to the conclusion that requesting from JetStream itself is the chokepoint here, which might explain why at larger batch sizes one `PullConsumer` is faster: at the start it sends one request instead of, e.g., 128.
To also check larger message sizes, I mocked a larger message (~6 kB):
Then with an example message from our system (~1 kB):
So for me it feels like it really depends on the use case, and I think we should somehow support both.
Please let me know if I've missed something that would make these measurements incorrect.
If you would like to check out the code, I've forked the repo: https://github.com/marmor157/jetstream_benchmarks/pull/1
It looks like JetStream has batching built into its system, and it makes sense to me to lean into the built-in batching mechanism, as it will likely be the most optimized. It sounds cumbersome to require developers to create a separate `Jetstream.PullConsumer` for each parallel message they want to consume. Best I can tell, the Broadway PR takes care of the complex case with concurrency and batching, and the existing `Jetstream.PullConsumer` is good for the simpler case of processing one message at a time. I'm not super familiar with Broadway, but it makes sense to me to leverage it for the complex concurrency case because it already knows how to do that and won't require reinventing it ourselves.
@mmmries What are your thoughts about the Broadway approach to handling parallel messages? We would like to finally make a release of this library, and this discussion seems like the only thing blocking the release.
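For context, a Broadway pipeline over JetStream might look roughly like this. This is a sketch under assumptions: the producer module name (`OffBroadway.Jetstream.Producer`) and its options should be checked against what the Broadway PR actually provides, and the connection, stream, and consumer names are made up.

```elixir
defmodule MyApp.Pipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        # Assumed producer module and options; verify against the Broadway PR.
        module:
          {OffBroadway.Jetstream.Producer,
           connection_name: :gnat,
           stream_name: "ORDERS",
           consumer_name: "ORDERS_PROCESSOR"},
        concurrency: 1
      ],
      processors: [
        # This is where the parallelism lives: up to 20 messages in flight.
        default: [concurrency: 20]
      ]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    # message.data holds the payload; a message that passes through
    # without failing is acknowledged by the producer's acknowledger.
    message
  end
end
```

The appeal of this split is that Broadway owns the concurrency, batching, and acking machinery, leaving `handle_message/3` as plain sequential code.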
The current implementation is essentially "single-threaded": it requests a single message, waits for it to arrive, and handles that message before sending back the `ACK` or `next_message`. My use-case at work would certainly benefit from the ability to specify a limit on how many messages to handle in parallel. Something like:
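(The snippet from the original comment isn't preserved here; below is a hypothetical sketch of such an option, borrowing the `max_concurrency` name from `Task.async_stream/3`. No such option exists in `Jetstream.PullConsumer` today, and the surrounding module shape follows the library's documented usage, which should be double-checked against the current docs.)

```elixir
defmodule MyApp.OrderConsumer do
  use Jetstream.PullConsumer

  def start_link(arg) do
    Jetstream.PullConsumer.start_link(__MODULE__, arg)
  end

  @impl true
  def init(_arg) do
    {:ok, nil,
     connection_name: :gnat,
     stream_name: "ORDERS",
     consumer_name: "ORDERS_PROCESSOR",
     # Hypothetical option: cap on messages processed in parallel.
     max_concurrency: 20}
  end

  @impl true
  def handle_message(message, state) do
    # ... process the message ...
    {:ack, state}
  end
end
```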
This would match the same option name from `Task.async_stream` in the standard library.