metarhia / globalstorage

Distributed Data Warehouse 🌍
https://metarhia.com
MIT License

Large datasets navigation #186

Open tshemsedinov opened 5 years ago

tshemsedinov commented 5 years ago

Sometimes a query may return a large dataset. For example, gs.select({ category: 'Person' }) called from the client sends a request to the server, which then spreads requests to all servers storing the mentioned category and starts receiving data in chunks, in lazy mode. The client-side cursor receives the first chunk and fires the on('data') event, after which the first 100 records are available on the client side to be iterated by the cursor. But we may not want to transfer more until the user navigates further down the grid. Is it OK for our GUI rendering console if the cursor does not have all the data at once and data arrives chunk by chunk, being pushed to the dataset? Also, how can the GUI tell the cursor to get the next chunk? @aqrln

tshemsedinov commented 5 years ago

Here is an example of how cursors may fetch data. @aqrln I think we can't use just cursor.dataset: Array; we need something like fetch(callback(err)) to load all records into the cursor so that they become available in cursor.dataset, or we can use an event to receive data in chunks: on('data', callback)
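To make the two options concrete, here is a minimal sketch of a cursor that supports both. It is not the globalstorage implementation: the class name ChunkedCursor, the fetchChunk() method, and the in-memory source array (standing in for the remote servers) are assumptions made for illustration, and setImmediate only simulates a network round trip.

```javascript
'use strict';

const { EventEmitter } = require('events');

// Hypothetical cursor: holds only the records received so far.
// `source` is an in-memory array standing in for the remote servers.
class ChunkedCursor extends EventEmitter {
  constructor(source, chunkSize = 100) {
    super();
    this.source = source;
    this.chunkSize = chunkSize;
    this.offset = 0;
    this.dataset = [];
  }

  // True once every record of the source has been received.
  get done() {
    return this.offset >= this.source.length;
  }

  // Load one chunk: emits 'data' with the new records, then callback(err).
  fetchChunk(callback) {
    // setImmediate simulates an asynchronous network round trip.
    setImmediate(() => {
      const end = this.offset + this.chunkSize;
      const chunk = this.source.slice(this.offset, end);
      this.offset += chunk.length;
      this.dataset.push(...chunk);
      if (chunk.length > 0) this.emit('data', chunk);
      callback(null);
    });
  }

  // Load all remaining chunks; afterwards cursor.dataset is complete.
  fetch(callback) {
    this.fetchChunk(err => {
      if (err || this.done) return void callback(err);
      this.fetch(callback);
    });
  }
}

const records = Array.from({ length: 250 }, (_, i) => ({ id: i }));
const cursor = new ChunkedCursor(records);
const chunkSizes = [];
cursor.on('data', chunk => chunkSizes.push(chunk.length));

// A GUI would call fetchChunk() only when the user scrolls past loaded rows.
cursor.fetchChunk(err => {
  if (err) throw err;
  console.log(cursor.dataset.length); // 100
  cursor.fetch(error => {
    if (error) throw error;
    console.log(cursor.dataset.length, chunkSizes); // 250 [ 100, 100, 50 ]
  });
});
```

In this sketch cursor.dataset grows as chunks arrive, so a consumer can either wait for fetch(callback) to finish or react to each on('data') event.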

aqrln commented 5 years ago

@tshemsedinov

Is it OK for our GUI rendering console if the cursor does not have all the data at once and data arrives chunk by chunk, being pushed to the dataset?

Yes, that's okay, as there's an event (on('data')) one can subscribe to and dispatch actions with new data chunks to the store.

Also, how can the GUI tell the cursor to get the next chunk?

A UI component will use an async action creator to dispatch a higher-order function to the store which will be intercepted by redux-thunk middleware and "converted" to a real action when the data becomes available, just like it requests any other data in Redux architecture.
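Roughly, the flow could look like the sketch below. To keep it runnable without dependencies, createStore here is a small hand-written stand-in for a Redux store with thunk-style dispatch, not the real redux or redux-thunk packages. The action type CHUNK_RECEIVED, the fetchNextChunk action creator, and the stub cursor with fetch(callback) are hypothetical names used only for illustration.

```javascript
'use strict';

// Stand-in for a Redux store with thunk support (not the real library).
const createStore = (reducer, initialState) => {
  let state = initialState;
  const getState = () => state;
  const dispatch = action => {
    // Thunk behaviour: a function is invoked with dispatch and getState
    // instead of being passed to the reducer.
    if (typeof action === 'function') return action(dispatch, getState);
    state = reducer(state, action);
    return action;
  };
  return { dispatch, getState };
};

// Pure reducer: appends a received chunk to the list of records.
const reducer = (state, action) => {
  switch (action.type) {
    case 'CHUNK_RECEIVED':
      return { ...state, records: [...state.records, ...action.chunk] };
    default:
      return state;
  }
};

// Stub cursor (assumption): fetch(callback) loads two more records.
const cursor = {
  dataset: [],
  fetch(callback) {
    setImmediate(() => {
      this.dataset.push({ id: 1 }, { id: 2 });
      callback(null);
    });
  },
};

// Async action creator: returns a function, which the thunk logic runs.
// The real action is dispatched only when the data becomes available.
const fetchNextChunk = source => dispatch =>
  new Promise((resolve, reject) => {
    const loaded = source.dataset.length;
    source.fetch(err => {
      if (err) return void reject(err);
      const chunk = source.dataset.slice(loaded);
      dispatch({ type: 'CHUNK_RECEIVED', chunk });
      resolve();
    });
  });

const store = createStore(reducer, { records: [] });
// A UI component would dispatch this when the user scrolls down the grid.
const pending = store.dispatch(fetchNextChunk(cursor));
pending.then(() => console.log(store.getState().records.length)); // 2
```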

tshemsedinov commented 5 years ago

@aqrln I forgot the link to the mentioned example: https://github.com/metarhia/globalstorage/pull/194 Cursors can work in two modes: (1) as a dataset holder, and (2) as a data transformation over a parent cursor, not holding a dataset. So cursors are chained, but we can materialize datasets at any step. I am going to add cursor.materialize(), which will receive all data from the parent cursor and save it in its own dataset. This allows us to minimize data copies in memory. Do we need to copy the data again into the Redux store? Or will we skip cursor materialization and collect aggregated/transformed data just in the store?
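As a rough illustration of the two modes, here is a synchronous sketch; the cursors in the linked pull request may differ, and data there arrives asynchronously. The Cursor class, its constructor arguments, and the map() and filter() helpers are assumptions. Only materialize() is named in this thread, and its behaviour here is a guess at the intent.

```javascript
'use strict';

// Hypothetical cursor: either holds a dataset (mode 1) or lazily
// transforms records from its parent without storing them (mode 2).
class Cursor {
  constructor(dataset = null, parent = null, transform = null) {
    this.dataset = dataset;
    this.parent = parent;
    this.transform = transform;
  }

  // Chain a lazy transformation; no records are copied at this point.
  map(fn) {
    return new Cursor(null, this, function* (parent) {
      for (const record of parent) yield fn(record);
    });
  }

  filter(fn) {
    return new Cursor(null, this, function* (parent) {
      for (const record of parent) if (fn(record)) yield record;
    });
  }

  // Iterate own dataset if present, otherwise pull through the parent.
  *[Symbol.iterator]() {
    if (this.dataset !== null) yield* this.dataset;
    else yield* this.transform(this.parent);
  }

  // Pull all records through the chain once and keep them in this cursor.
  materialize() {
    if (this.dataset === null) {
      this.dataset = [...this];
      this.parent = null;
      this.transform = null;
    }
    return this;
  }
}

const people = new Cursor([
  { name: 'Marcus', born: 121 },
  { name: 'Lucius', born: 130 },
  { name: 'Commodus', born: 161 },
]);

const names = people.filter(p => p.born < 150).map(p => p.name);
console.log(names.dataset); // null
names.materialize();
console.log(names.dataset); // [ 'Marcus', 'Lucius' ]
```

Because the intermediate filter cursor never stores anything, the only array created for the result is the one built by materialize().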

aqrln commented 5 years ago

@tshemsedinov the store is immutable and can only be changed with pure functions (reducers) in response to actions, so if something changes, we need to copy it again.
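A small sketch of what that copy amounts to in plain JavaScript, with a hypothetical recordsReducer and CHUNK_RECEIVED action type: the reducer builds a new array on every change, but the copy is shallow, so the record objects themselves are shared with the incoming chunk rather than duplicated.

```javascript
'use strict';

// Pure reducer: never mutates `state`, always returns a new array on change.
const recordsReducer = (state = [], action) => {
  switch (action.type) {
    case 'CHUNK_RECEIVED':
      return [...state, ...action.chunk];
    default:
      return state;
  }
};

const chunk = [{ id: 1 }, { id: 2 }];
// Freezing the previous state shows that the reducer does not modify it.
const before = Object.freeze([]);
const after = recordsReducer(before, { type: 'CHUNK_RECEIVED', chunk });

console.log(after !== before); // true
console.log(after[0] === chunk[0]); // true
```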

tshemsedinov commented 5 years ago

See https://github.com/metarhia/globalstorage/issues/195