Parsely / pykafka

Apache Kafka client for Python; high-level & low-level consumer/producer, with great performance.
http://pykafka.readthedocs.org/
Apache License 2.0

Support rdkafka with gevent #446

Open thedrow opened 8 years ago

thedrow commented 8 years ago

As libraries like psycopg2 have shown, it is possible to integrate a C extension with gevent. This may or may not require cooperation on librdkafka's part, but it would provide a lot of value in terms of throughput and concurrency. Let's investigate what's required to enable gevent support even with the C extension enabled.

emmettbutler commented 8 years ago

This is a good idea. I'm not personally familiar with how psycopg2 accomplishes its gevent support, so any information you can provide about that would be helpful.

thedrow commented 8 years ago

OMCache allows you to provide a function that invokes select (or poll, epoll, or any other polling mechanism for that matter). Gevent provides a greenlet-friendly select, which allows context switches whenever gevent.select.select() is called. Psycopg2 has bindings to the libpq async API, which lets you check the state of the current connection. So whenever a network operation is about to block, our wait callback is invoked: it polls the connection for its state and calls the appropriate gevent method to wait on the file descriptor until it is readable or writable. Psycogreen implements exactly that. I don't think librdkafka provides us with such mechanisms. The only callback I encountered was for logging.
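
To make the wait-callback pattern above concrete, here is a minimal sketch using only the standard library. It is not psycopg2's or psycogreen's actual code: `FakeAsyncConnection`, `make_wait_callback`, and the `POLL_*` values are hypothetical names invented for illustration, and the socket pair only stands in for a real client connection. The assumption is that under gevent you would pass `gevent.select.select` as `select_fn`, so the wait yields to other greenlets instead of blocking the thread. With psycopg2, a callback of this shape is what gets registered through `psycopg2.extensions.set_wait_callback`.

```python
import select
import socket

# Illustrative poll states (hypothetical values). They mirror the idea of
# psycopg2's POLL_OK / POLL_READ / POLL_WRITE but are not taken from it.
POLL_OK, POLL_READ, POLL_WRITE = 0, 1, 2


def make_wait_callback(select_fn=select.select):
    """Build a wait callback around a select-compatible function.

    Assumption: under gevent, select_fn would be gevent.select.select,
    which switches greenlets while waiting.
    """
    def wait_callback(conn, timeout=None):
        while True:
            state = conn.poll()
            if state == POLL_OK:
                return  # the operation has completed
            elif state == POLL_READ:
                # Wait until the descriptor is readable, then poll again.
                select_fn([conn.fileno()], [], [], timeout)
            elif state == POLL_WRITE:
                # Wait until the descriptor is writable, then poll again.
                select_fn([], [conn.fileno()], [], timeout)
            else:
                raise RuntimeError("unexpected poll state: %r" % (state,))
    return wait_callback


class FakeAsyncConnection:
    """Hypothetical stand-in for a non-blocking client connection."""

    def __init__(self):
        self._ours, self._peer = socket.socketpair()
        self._ours.setblocking(False)
        self.result = None

    def fileno(self):
        return self._ours.fileno()

    def send_reply(self, payload):
        # Simulates the server answering on the other end of the socket.
        self._peer.sendall(payload)

    def poll(self):
        if self.result is not None:
            return POLL_OK
        try:
            self.result = self._ours.recv(4096)
        except BlockingIOError:
            return POLL_READ  # nothing to read yet; caller should wait
        return POLL_OK


conn = FakeAsyncConnection()
conn.send_reply(b"pong")
make_wait_callback()(conn, timeout=1.0)
print(conn.result)
```

The point of the sketch is that the client library must expose both a file descriptor and a non-blocking way to check progress; without those two hooks there is nothing for a gevent-aware callback to wait on.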