skroutz / rafka

Kafka proxy with a simple API, speaking the Redis protocol
https://engineering.skroutz.gr/blog/kafka-rails-integration/
GNU General Public License v3.0

Increased memory usage with many short-lived producers #39

Open agis opened 7 years ago

agis commented 7 years ago

We have a pathological case where RSS can skyrocket: a large number of short-lived producers.

Such a case is typical with forking clients. resque, for example, spawns a process per job and kills it when the job is done. Assuming a short-lived job that also produces a single message to Rafka, we may end up with hundreds or even thousands of producers that are spawned only to produce one message and die afterwards.

Meanwhile, confluent-kafka-go producers are costly, since each of them pre-allocates two 1MB buffers.

The situation is made even worse by https://github.com/golang/go/issues/16930.

Proposal

This could be fixed by re-architecting Rafka to have a N:M model (N=client producers, M=librdkafka producers), but that would require significant changes and would make Rafka usage more complex. We want to keep the 1:1 model if possible because it is simple.

However, we can mitigate the issue in a few ways:

We should also state in the README that Rafka, like librdkafka itself, is optimized for a few long-lived producers rather than bursty usage patterns (i.e. many short-lived producers).

agis commented 5 years ago

https://github.com/golang/go/issues/16930 is now closed. We should verify that rafka built with Go 1.13 no longer suffers from this issue, and close this.

Also relevant: https://github.com/golang/go/issues/30333.

agis commented 5 years ago

After testing with go tip (65ef999), the situation is pretty much the same as before. So I'm leaving this open as a known issue.