uwdata / mosaic

An extensible framework for linking databases and interactive views.
https://idl.uw.edu/mosaic

Multi-tier Execution #389

Open domoritz opened 1 month ago

domoritz commented 1 month ago

Right now, Mosaic executes queries either in the browser or via remote requests. When the network connection has high latency, queries over Mosaic’s indexes can become too slow for analysis at the speed of thought. To overcome this, we should design and develop hybrid, multi-tier execution for Mosaic: queries over Mosaic indexes run locally, even when the indexes themselves have to be computed remotely because the data is too large. An extension of this project could automatically determine the most efficient distributed query plan, similar to MotherDuck and VegaPlus.
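To make the idea concrete, here is a rough sketch of the routing decision only. It assumes the base data lives only on the server and that a remotely built index can be copied into the in-browser database. `TieredPlanner`, `QueryRequest`, and `markIndexLocal` are hypothetical names for illustration, not part of Mosaic's API.

```typescript
// Hypothetical types for illustration; none of these names exist in Mosaic.
type Tier = "local" | "remote";

interface QueryRequest {
  // build-index: compute an index from base data
  // query-index: read from a previously built index
  // query-base:  read the base data directly
  kind: "build-index" | "query-index" | "query-base";
  indexId?: string;
}

class TieredPlanner {
  // Ids of indexes that have been copied into the local (in-browser) database.
  private localIndexes = new Set<string>();

  /** Record that a remotely built index is now available locally. */
  markIndexLocal(indexId: string): void {
    this.localIndexes.add(indexId);
  }

  /** Decide which tier should execute a request. */
  plan(req: QueryRequest): Tier {
    switch (req.kind) {
      case "build-index":
        // Assumption: base data is too large for the browser.
        return "remote";
      case "query-base":
        return "remote";
      case "query-index":
        // Run locally only once the index has been transferred.
        return req.indexId !== undefined && this.localIndexes.has(req.indexId)
          ? "local"
          : "remote";
    }
  }
}

// Usage: the first index query goes remote; after the index is
// transferred and registered, later queries stay in the browser.
const planner = new TieredPlanner();
const before = planner.plan({ kind: "query-index", indexId: "idx1" });
planner.markIndexLocal("idx1");
const after = planner.plan({ kind: "query-index", indexId: "idx1" });
```

A real implementation would also need to handle index transfer, invalidation, and size limits, all of which this sketch leaves out.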

derekperkins commented 1 month ago

I think this would be very useful in conjunction with #398 on multi-table support. With a star schema, we would probably run some of our dimension tables client-side while the big data stays server-side.
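Roughly what I have in mind, as a sketch: it assumes the server holds every table and the client holds only a subset, so a query can run client-side only if every table it touches is placed there. `Placement`, `chooseExecutor`, and the table names are made up for illustration.

```typescript
// Hypothetical placement model; not an existing Mosaic API.
type Placement = "client" | "server";

/**
 * Pick where a query should run, given the tables it references.
 * Assumption: the server has all tables, so any query that touches
 * a server-only table (or an unknown table) is sent to the server.
 */
function chooseExecutor(
  tables: string[],
  placement: Map<string, Placement>
): Placement {
  const allClient =
    tables.length > 0 &&
    tables.every((t) => placement.get(t) === "client");
  return allClient ? "client" : "server";
}

// Example star schema: small dimension tables in the browser,
// the large fact table on the server. Table names are invented.
const placement = new Map<string, Placement>([
  ["dim_country", "client"],
  ["dim_product", "client"],
  ["fact_sales", "server"],
]);

const dimOnly = chooseExecutor(["dim_country", "dim_product"], placement);
const withFacts = chooseExecutor(["fact_sales", "dim_country"], placement);
```

Splitting a single join across both tiers would need an actual distributed planner, which is beyond this sketch.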

derekperkins commented 1 month ago

As an extension to this, and with multi-db support in #399, I would hope it becomes possible to use different engines on the backend and the frontend. We're looking at using StarRocks on the backend, which supports multi-tiered storage from object store -> hot SSD, and hopefully from there to DuckDB WASM browser-side.