restatedev / restate

Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
https://docs.restate.dev

Update Datafusion data access to work in distributed setup #1806

Open tillrohrmann opened 2 months ago

tillrohrmann commented 2 months ago

Currently, we use DataFusion to expose administrative information to the CLI and to users of the psql integration. The DataFusion data access layer assumes that it has access to all partitions on a single node. This assumption will most likely no longer hold in a distributed setup, since a single node might run only a subset of the available partitions. Therefore, we need to change the DataFusion data access layer so that it can fetch the required data from multiple nodes. Note that we don't require strong consistency guarantees at this point (data from different partitions does not have to be consistent with respect to each other). However, it would be great if we could ensure monotonic reads, that is, a later query should never return older data than an earlier one did.
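To make the monotonic-reads requirement concrete, here is a minimal sketch of one possible approach. It is not Restate's actual design. It assumes every partition snapshot carries a version that only grows (for example, the last applied log sequence number), and that the caller has a routing table from partitions to candidate nodes. All names (`PartitionSnapshot`, `PartitionReader`, `MonotonicSession`, `InMemoryCluster`) are hypothetical, and the DataFusion integration itself is left out.

```rust
use std::collections::HashMap;

// Hypothetical identifiers; these are not Restate's actual types.
pub type PartitionId = u64;
pub type NodeId = u32;

/// Data of one partition as seen by one node.
/// Assumption: `version` only ever grows for a given partition.
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionSnapshot {
    pub partition: PartitionId,
    pub version: u64,
    pub rows: Vec<String>,
}

/// Hypothetical abstraction over "ask a node for a partition's data".
/// In a real system this would be a network call.
pub trait PartitionReader {
    fn read(&self, node: NodeId, partition: PartitionId) -> Option<PartitionSnapshot>;
}

/// Per-client state that remembers the newest version seen per partition.
#[derive(Default)]
pub struct MonotonicSession {
    last_seen: HashMap<PartitionId, u64>,
}

impl MonotonicSession {
    /// Reads every partition in `routing` from the first candidate node whose
    /// snapshot is at least as new as what this session has already observed.
    /// Partitions are read independently, so no cross-partition consistency
    /// is provided. Returns `Err(partition)` if no node is fresh enough.
    pub fn scan<R: PartitionReader>(
        &mut self,
        reader: &R,
        routing: &HashMap<PartitionId, Vec<NodeId>>,
    ) -> Result<Vec<PartitionSnapshot>, PartitionId> {
        let mut partitions: Vec<PartitionId> = routing.keys().copied().collect();
        partitions.sort_unstable();

        let mut result = Vec::with_capacity(partitions.len());
        for partition in partitions {
            let floor = self.last_seen.get(&partition).copied().unwrap_or(0);
            let snapshot = routing[&partition]
                .iter()
                .filter_map(|&node| reader.read(node, partition))
                .find(|snapshot| snapshot.version >= floor)
                .ok_or(partition)?;
            self.last_seen.insert(partition, snapshot.version);
            result.push(snapshot);
        }
        Ok(result)
    }
}

/// In-memory stand-in for a cluster, only for trying out the sketch.
#[derive(Default)]
pub struct InMemoryCluster {
    data: HashMap<(NodeId, PartitionId), PartitionSnapshot>,
}

impl InMemoryCluster {
    pub fn put(&mut self, node: NodeId, snapshot: PartitionSnapshot) {
        self.data.insert((node, snapshot.partition), snapshot);
    }
}

impl PartitionReader for InMemoryCluster {
    fn read(&self, node: NodeId, partition: PartitionId) -> Option<PartitionSnapshot> {
        self.data.get(&(node, partition)).cloned()
    }
}
```

With this shape, a session that has seen version 5 of a partition rejects a node that only has version 3 and falls through to the next candidate. Whether the floor should live in the client session, in the query context, or be passed along with the request is an open design choice that this sketch does not settle.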

tillrohrmann commented 1 week ago

https://github.com/restatedev/restate/issues/1907 could be required to solve this issue.