robinhood / faust

Python Stream Processing

Time taken to send lots of messages is longer in faust than aiokafka #718

Open surculus12 opened 3 years ago

surculus12 commented 3 years ago


Unable to test on master because I get an error regarding unexpected kwargs in aiokafka during startup.

Steps to reproduce

Produce a large number of messages like so:

import asyncio
import time

import faust

FAUST_BROKER_URL = ['kafka://k8s-c4-w1:30656']  # list is truncated in the original post

app = faust.App('testing')

class Timeit(object):
    def __init__(self, msg):
        self.msg = msg
        self.start = time.time()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print(f'{1000*(time.time()-self.start):,.3f}ms: {self.msg}')

# Set the broker once the app configuration is loaded.
@app.on_configured.connect
def configure_from_settings(app, conf, **kwargs):
    conf.broker = FAUST_BROKER_URL

test_topic = app.topic('testing', value_type=str)

async def testing():
    n = 5000
    with Timeit(f"Producing {n} messages"):
        await asyncio.gather(*[test_topic.send(value=str(i)) for i in range(n)])


Expected behavior

Expected time taken to be similar to that of aiokafka (list of (time_in_ms, num_msgs)):


Code for producing with aiokafka:

import time
import asyncio

from aiokafka import AIOKafkaProducer
import yappi

FAUST_BROKER_URL = ['kafka://k8s-c4-w1:30656']  # list is truncated in the original post
KAFKA_BROKER_URL = ','.join([u.replace('kafka://', '') for u in FAUST_BROKER_URL])

loop = asyncio.get_event_loop()

class Timeit(object):
    def __init__(self, msg):
        self.msg = msg
        self.start = time.time()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print(f'{1000*(time.time()-self.start):,.3f}ms: {self.msg}')

async def do(p: AIOKafkaProducer):
    n = 20000
    payloads = [str(i).encode('utf-8') for i in range(n)]

    with Timeit(f"Producing {n} messages"):
        await asyncio.gather(*[await p.send('testing', payload) for payload in payloads])

async def start():
    producer = AIOKafkaProducer(
        loop=loop, bootstrap_servers=KAFKA_BROKER_URL)
    # Get cluster layout and initial topic/partition leadership information
    await producer.start()
    try:
        # Produce message
        await do(producer)
    finally:
        # Wait for all pending messages to be delivered or expire.
        await producer.stop()
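
For readers unfamiliar with the `await p.send(...)` inside the list comprehension in `do()`: with aiokafka, awaiting `send()` only enqueues the message and returns a future, and `gather()` then waits for the deliveries. The sketch below illustrates that two-phase pattern with plain asyncio. `FakeProducer`, `flush()` and `batch_delay` are hypothetical stand-ins, not part of aiokafka or Faust, and the single simulated round trip is an assumption made for illustration only.

```python
import asyncio


class FakeProducer:
    """Hypothetical stand-in for a producer whose send() returns a delivery future."""

    def __init__(self, batch_delay=0.01):
        self.batch_delay = batch_delay
        self.pending = []

    async def send(self, topic, value):
        # Phase 1: enqueue the message and hand back a future for its delivery.
        fut = asyncio.get_running_loop().create_future()
        self.pending.append((value, fut))
        return fut

    async def flush(self):
        # Phase 2: pretend the whole batch is delivered in one round trip.
        await asyncio.sleep(self.batch_delay)
        for value, fut in self.pending:
            fut.set_result(len(value))
        self.pending.clear()


async def produce(n):
    producer = FakeProducer()
    # Awaiting send() here only enqueues; nothing has been delivered yet.
    futures = [await producer.send('testing', str(i).encode('utf-8'))
               for i in range(n)]
    flusher = asyncio.create_task(producer.flush())
    # gather() waits for every delivery future at once.
    results = await asyncio.gather(*futures)
    await flusher
    return results


if __name__ == '__main__':
    print(len(asyncio.run(produce(1000))))
```

The Faust snippet, by contrast, passes the `test_topic.send(...)` coroutines straight to `gather()`, so the two scripts do not necessarily measure the same thing.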


Actual behavior

Example of number of messages and time taken:


Sending 20k messages takes about 4x as long in Faust as in aiokafka (and sometimes considerably longer). In a service I manage, it is significantly cheaper to create and stop an aiokafka producer than to send the messages through Faust.
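
A minimal sketch of how such a comparison could be collected in one run, so both producers are timed for the same message counts. `timed`, `compare`, `slow_send` and `fast_send` are hypothetical names; the two stand-in coroutines only sleep and would need to be replaced by thin wrappers around the real producers (for example a function that awaits `test_topic.send(value=v)`).

```python
import asyncio
import time


async def timed(send, n):
    """Return elapsed milliseconds for n concurrent calls to an async send(value)."""
    start = time.perf_counter()
    await asyncio.gather(*[send(str(i)) for i in range(n)])
    return 1000 * (time.perf_counter() - start)


async def compare(send_a, send_b, counts):
    """Time both send functions for each count; return (n, ms_a, ms_b) rows."""
    rows = []
    for n in counts:
        ms_a = await timed(send_a, n)
        ms_b = await timed(send_b, n)
        rows.append((n, ms_a, ms_b))
    return rows


# Hypothetical stand-ins: they simulate work and do not talk to Kafka.
async def slow_send(value):
    await asyncio.sleep(0.002)


async def fast_send(value):
    await asyncio.sleep(0)


if __name__ == '__main__':
    for n, ms_a, ms_b in asyncio.run(compare(slow_send, fast_send, [100, 1000])):
        ratio = ms_a / max(ms_b, 1e-9)  # guard against a zero denominator
        print(f'{n:>6} msgs: {ms_a:,.3f}ms vs {ms_b:,.3f}ms (x{ratio:.1f})')
```

The numbers this prints describe the stand-ins only; they say nothing about Faust or aiokafka until real send functions are plugged in.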


bobh66 commented 3 years ago

This project appears to have been abandoned.

You might want to check out the fork of this project -

It has a bunch of fixes merged for problems that were in the base project.