robinhood / faust

Python Stream Processing

GlobalTable not working with multiple workers #716

Open vinaygopalkrishnan opened 3 years ago

vinaygopalkrishnan commented 3 years ago

Hi,

I'm using a GlobalTable across 3 workers. The table stores metadata that is joined against during stream processing, and it is supposed to hold 300,000 keys. However, only 1 of the 3 workers has all 300,000 keys. The other 2 workers stop reading keys after a certain point: the 2nd worker has 104,688 keys and the 3rd has only 23,583.

Here is the table definition:

test_changelog_topic = app.topic(
    "test-changelog",
    partitions=1,
    compacting=True,
    deleting=False,
)

lookup_upc = app.GlobalTable(
    "test",
    default=None,
    value_type=ApexUpc,
    partitions=1,
    changelog_topic=test_changelog_topic,
    extra_topic_configs={"cleanup.policy": "compact"},
)
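
To put numbers on the gap described above, here is a small standalone sketch that computes how many keys each worker is missing relative to the expected 300,000. It does not use Faust at all. The function name `key_shortfall` and the worker labels are made up for illustration; only the key counts come from this report.

```python
from typing import Dict


def key_shortfall(expected: int, observed: Dict[str, int]) -> Dict[str, int]:
    """Return how many keys each worker is missing relative to `expected`."""
    return {worker: expected - count for worker, count in observed.items()}


# Key counts reported in this issue; the worker labels are hypothetical.
observed = {"worker-1": 300_000, "worker-2": 104_688, "worker-3": 23_583}

missing = key_shortfall(300_000, observed)
for worker, gap in missing.items():
    # Print the absolute gap and its share of the full table.
    print(f"{worker}: missing {gap} keys ({gap / 300_000:.1%} of the table)")
```

With these counts, the 2nd worker is short 195,312 keys and the 3rd is short 276,417, so both have loaded well under half of the changelog.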

bobh66 commented 3 years ago

You might want to try the fork at https://github.com/faust-streaming/faust

It has a lot of fixes that have not been applied to this project, which has been unmaintained for a while now.

vinaygopalkrishnan commented 3 years ago

Thanks for the update, will try that fork.