superstreamlabs / memphis

Memphis.dev is a highly scalable and effortless data streaming platform
https://docs.memphis.dev
Other
3.22k stars, 218 forks

Feature: materialised views #344

Open gedw99 opened 1 year ago

gedw99 commented 1 year ago

Current behavior

It's great to have a surface for configuring NATS pub/sub, but it would also be good to have materialised views over the data.

Suggested solution

https://github.com/cashapp/pranadb is what I was thinking of.

Basically, you can have a distributed database of materialized views.

PranaDB is written in Go but only works with Kafka. The end user queries the data with SQL.

See their example to get the idea. It's very simple: https://github.com/cashapp/pranadb/blob/main/docs/demo.md
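
To illustrate what "materialized view over the data" means here, below is a minimal sketch in Python. It is not PranaDB's or Memphis's implementation; the class name `MaterializedView`, the field names `customer_id` and `amount`, and the sample messages are all hypothetical. The idea is that the result of a query such as `SELECT customer_id, COUNT(*), SUM(amount) ... GROUP BY customer_id` is updated incrementally as each message arrives, so reads never rescan the stream.

```python
from collections import defaultdict


class MaterializedView:
    """Hypothetical in-memory view: per-key COUNT(*) and SUM(value).

    Each incoming message updates the stored result, so a query is a
    dictionary lookup instead of a scan over the whole stream.
    """

    def __init__(self, key_field, value_field):
        self.key_field = key_field
        self.value_field = value_field
        self.rows = defaultdict(lambda: {"count": 0, "total": 0})

    def apply(self, message):
        # Fold one message into the aggregate row for its key.
        row = self.rows[message[self.key_field]]
        row["count"] += 1
        row["total"] += message[self.value_field]

    def query(self, key):
        # Return a copy of the current aggregate, or None for an unseen key.
        row = self.rows.get(key)
        return dict(row) if row else None


# Hypothetical messages, as they might arrive from a station or topic.
view = MaterializedView("customer_id", "amount")
for msg in [
    {"customer_id": "a", "amount": 10},
    {"customer_id": "b", "amount": 5},
    {"customer_id": "a", "amount": 7},
]:
    view.apply(msg)

print(view.query("a"))  # {'count': 2, 'total': 17}
```

A real system would also persist and distribute this state; the sketch only shows the incremental update.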

yanivbh1 commented 1 year ago

Hey @gedw99, thanks for the feature request! It definitely makes sense and it's on our roadmap for Q4. Our current approach would be to create a sliding window, built from a join of stations and queryable with SQL. What do you think?
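
As a rough illustration of the "sliding window built from a join of stations" idea, here is a minimal Python sketch. It is an assumption about how such a join could work, not the Memphis design: the class `SlidingWindowJoin`, the side names `left` and `right`, the field `order_id`, and the 60-second window are all hypothetical, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowJoin:
    """Hypothetical equi-join of two stations over a time window."""

    def __init__(self, window_seconds, key_field):
        self.window = window_seconds
        self.key_field = key_field
        # One time-ordered buffer of (timestamp, message) per station.
        self.buffers = {"left": deque(), "right": deque()}

    def _evict(self, now):
        # Drop messages that have fallen out of the window.
        for buf in self.buffers.values():
            while buf and buf[0][0] < now - self.window:
                buf.popleft()

    def push(self, side, timestamp, message):
        """Add a message from one station; return the joined rows it produces."""
        self._evict(timestamp)
        other = "right" if side == "left" else "left"
        matches = []
        for _, candidate in self.buffers[other]:
            if candidate[self.key_field] == message[self.key_field]:
                left, right = (
                    (message, candidate) if side == "left" else (candidate, message)
                )
                matches.append({**left, **right})
        self.buffers[side].append((timestamp, message))
        return matches


join = SlidingWindowJoin(window_seconds=60, key_field="order_id")
join.push("left", 0, {"order_id": 1, "item": "book"})
first = join.push("right", 30, {"order_id": 1, "status": "paid"})
late = join.push("right", 100, {"order_id": 1, "status": "shipped"})

print(first)  # [{'order_id': 1, 'item': 'book', 'status': 'paid'}]
print(late)   # []
```

The second message joins because both sides fall within 60 seconds; the third finds nothing because the matching `left` message has already been evicted. A SQL layer would sit on top of the joined rows.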

gedw99 commented 1 year ago

Hey @yanivbh1, thanks for considering this.

PranaDB is continuous and distributed, making a DB for each "station". So I guess you might be thinking about doing the joins between stations. PranaDB seems to already have joins in its codebase, from my very brief search.

This issue, however, does refer to windowed joins. Perhaps that's what you mean? https://github.com/cashapp/pranadb/issues/214

idanasulin2706 commented 1 year ago

Hi @gedw99, querying the data with SQL is absolutely one of our goals for this year. We may not end up supporting PranaDB as the query engine of Memphis; we are still researching this feature, so we don't have the whole picture right now.

gedw99 commented 1 year ago

It seems DataFusion is taking over the world for this use case at the moment, with a data lake on S3.