fstpackage / fsttable

An interface to fast on-disk data tables stored with the fst format
GNU Affero General Public License v3.0

A remote table implementation and the physical data could be separated by a low-speed medium #27

Open MarcusKlik opened 6 years ago

MarcusKlik commented 6 years ago

For example, we can have a fst_remote package that implements the fst format as a remote table. That structure could easily be modified so that the fst_remote package runs on a different computer than the table proxy (which would run on the client side).

In effect, that would create a remote database that serves data from fst files (accessible with a data.table interface). Therefore, it might be good design to keep the in-memory structures in the table proxy class as small as possible. For example, when a row mask is calculated from an i expression, that structure could be saved to disk on the remote side instead of being returned to memory. Subsequent operations that need the mask would read it from file before using it. This avoids sending data back and forth between the table proxy and the remote table implementation until absolutely necessary, which helps speed when working over a network connection, for example. It would also be in line with the fsttable philosophy of keeping memory requirements as small as possible.
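To make the idea concrete, here is a minimal sketch in Python (not R, and not the actual fsttable or fst API). The names `RemoteTable`, `TableProxy`, `compute_mask` and `collect` are hypothetical, the "remote" side is just an object in the same process, and the predicate is a plain callable rather than a serialized i expression. It only illustrates that the proxy keeps a small handle to the mask while the mask itself lives on disk next to the data.

```python
import os
import tempfile


class RemoteTable:
    """Hypothetical stand-in for the remote table implementation (owns the data)."""

    def __init__(self, columns, workdir):
        self._columns = columns      # dict: column name -> list of values
        self._workdir = workdir      # directory where masks are stored
        self._next_id = 0

    def compute_mask(self, column, predicate):
        """Evaluate the predicate per row, store the mask on disk, return a handle."""
        mask = bytes(1 if predicate(v) else 0 for v in self._columns[column])
        path = os.path.join(self._workdir, f"mask_{self._next_id}.bin")
        self._next_id += 1
        with open(path, "wb") as f:
            f.write(mask)
        return path                  # only this small handle goes back to the proxy

    def collect(self, mask_handle, columns):
        """Read the stored mask back from file and return only the selected rows."""
        if mask_handle is None:
            return {name: list(self._columns[name]) for name in columns}
        with open(mask_handle, "rb") as f:
            mask = f.read()
        return {
            name: [v for v, keep in zip(self._columns[name], mask) if keep]
            for name in columns
        }


class TableProxy:
    """Hypothetical client-side proxy: holds a mask handle, never the mask itself."""

    def __init__(self, remote):
        self._remote = remote
        self._mask_handle = None

    def filter(self, column, predicate):
        self._mask_handle = self._remote.compute_mask(column, predicate)
        return self

    def collect(self, columns):
        # Data crosses the (possibly slow) link only here, when it is needed.
        return self._remote.collect(self._mask_handle, columns)


workdir = tempfile.mkdtemp()
remote = RemoteTable({"x": [1, 5, 10, 20], "y": ["a", "b", "c", "d"]}, workdir)
proxy = TableProxy(remote).filter("x", lambda v: v > 4)
result = proxy.collect(["y"])
print(result)
```

In a real setup the calls between the proxy and the remote table would go over the network, so the proxy's state would stay at the size of a file path regardless of how many rows the table has.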