aio-libs / aiokafka

asyncio client for kafka
http://aiokafka.readthedocs.io/
Apache License 2.0

Introduce "socket_keepalive_enable" configuration property #670

Open amotl opened 4 years ago

amotl commented 4 years ago

Dear Taras and Denis,

thank you so much for all of your answers within #665 and for picking up the work on aiokafka.

I would like to share a story with you that we had to handle just recently. It's a fun read, but obviously not so funny when you run into the respective issues yourself.

It is related to some networking issues people have been experiencing when running on Azure. I have already reported all of the nitty-gritty details at https://github.com/edenhill/librdkafka/issues/3109. Enjoy!

So, while I see that metadata_max_age_ms is already configurable in order to adjust to the recommended settings suitable when running on Azure (either when running vanilla Kafka or when connecting to Azure Event Hubs),

https://github.com/aio-libs/aiokafka/blob/48f5df54c90cdfd80d079c20bd9c05663cd43a7b/aiokafka/consumer/consumer.py#L111-L114

there should also be a way to configure socket_keepalive_enable, as outlined at https://github.com/edenhill/librdkafka/issues/3109#issuecomment-714471123.

The recommended settings in the context of our observations on Azure would be

socket_keepalive_enable = true
metadata_max_age_ms = 180000

So, I am humbly asking to take that into consideration for the upcoming release.

With kind regards, Andreas.

amotl commented 4 years ago

To get a rough idea of how that might be done in pure Python, you might want to check [1] as a reference, originally coming from https://github.com/crate/crate-python/pull/374 by @chaudum, or [2] by @tyrande000.

The most basic approach would be something like:

if socket_keepalive_enable:
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

[1] https://github.com/crate/crate-python/blob/0.26.0/src/crate/client/http.py#L286-L310
[2] https://github.com/turingsec/marsnake/blob/799400d62677c2aa36d741c3154202503e60d71b/network/ksocket.py#L72-L83
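
For illustration, a slightly fuller sketch of the same idea in plain Python. The helper name `enable_keepalive` and the timing values are assumptions made for this example, not part of aiokafka or of the linked projects. The tuning constants (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are not available on every platform, so each one is applied only if the `socket` module exposes it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=3):
    """Turn on TCP keepalive and, where the platform allows, tune its timing.

    The default values are illustrative assumptions, not recommendations.
    """
    # This is the portable part: ask the kernel to send keepalive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning options are platform-specific, so skip any that are missing.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Calling `enable_keepalive(sock)` on a freshly created TCP socket, before connecting, would be one way to use it.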

tvoinarovskyi commented 4 years ago

Yes, it seems like an option to provide custom socket flags would be great. We could also allow other values to be modified, such as the buffer size or the TCP_NODELAY option.

asvetlov commented 4 years ago

TCP_NODELAY is on by default for asyncio starting from Python 3.7 BTW

ods commented 4 years ago

BTW, kafka-python does support customisation via the socket_options parameter:
https://github.com/dpkp/kafka-python/blob/6fc008137c75c751a9fbea3e0ef36d2870119c7b/kafka/conn.py#L135-L137
https://github.com/dpkp/kafka-python/blob/6fc008137c75c751a9fbea3e0ef36d2870119c7b/kafka/conn.py#L373-L375

With asyncio, this means we have to create the socket ourselves instead of relying on what loop.create_connection() does.
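
A rough sketch of what creating the socket ourselves could look like with asyncio. The function name `open_connection_with_options` and the `SOCKET_OPTIONS` default are hypothetical, and this is not how aiokafka is implemented. The options use the `(level, optname, value)` tuple form that kafka-python's socket_options takes. For brevity it uses asyncio streams and only tries the first resolved address, which a real client would probably not do.

```python
import asyncio
import socket

# Hypothetical default, in the (level, optname, value) tuple form.
SOCKET_OPTIONS = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]


async def open_connection_with_options(host, port, socket_options=SOCKET_OPTIONS):
    """Connect a pre-configured socket, then hand it over to asyncio."""
    loop = asyncio.get_running_loop()

    # Resolve the address ourselves, since asyncio no longer does it for us.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, type_, proto, _, address = infos[0]  # simplification: first result only

    sock = socket.socket(family, type_, proto)
    try:
        sock.setblocking(False)
        # Apply the custom options before the connection is established.
        for level, optname, value in socket_options:
            sock.setsockopt(level, optname, value)
        await loop.sock_connect(sock, address)
        # Passing sock= makes asyncio wrap the existing socket
        # instead of creating a new one.
        return await asyncio.open_connection(sock=sock)
    except BaseException:
        sock.close()
        raise
```

The same `sock=` argument is accepted by loop.create_connection(), so a protocol-based client could follow the same pattern.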

TribuneX commented 1 year ago

Any update on this request? We are also affected by some Azure Event Hub connectivity issues (see: #624) and would like to adapt the settings as @amotl reported.