Smarter data management

gkunter commented 8 years ago

Currently, the data manager reprocesses the whole results data frame whenever anything in the processing pipe changes. These changes can be very small, e.g. simply removing a sorter from a column. This can lead to unexpected delays, in particular if the contexts have to be retrieved again.

Thus, a smarter way of recalculating the output object is needed. First of all, the GUI needs to be more aware of which changes actually require reprocessing. For example, reprocessing is currently triggered when the group filters are activated even if no group filter is actually specified.
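
A minimal sketch of such a check, assuming the GUI state is available as a plain dictionary. The keys (`group_filters_enabled`, `group_filters`, `sorters`) and both function names are hypothetical and not taken from the Coquery code base:

```python
def effective_settings(settings):
    """Reduce the raw GUI state to the parts that can affect the output.

    All keys used here are hypothetical placeholders.
    """
    group_filters = settings.get("group_filters", [])
    # An activated but empty group filter list changes nothing.
    enabled = settings.get("group_filters_enabled", False) and bool(group_filters)
    return {
        "group_filters": tuple(group_filters) if enabled else (),
        "sorters": tuple(settings.get("sorters", [])),
    }


def needs_reprocessing(old, new):
    """Return True only if the change can alter the output object."""
    return effective_settings(old) != effective_settings(new)
```

With this comparison, ticking the group filter checkbox while the filter list is still empty would not trigger reprocessing, whereas adding an actual filter or changing a sorter would.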

Second, the processing pipe needs to be optimized. At the very least, the contexts need to be cached so that they don't have to be queried again. Ideally, recalculation starts only at the point in the processing pipe at which the change occurs, but this means that the result of every intermediate step in the pipe has to be stored somewhere.
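
The second idea could look roughly like the following sketch. It uses plain lists of dictionaries instead of a data frame to stay self-contained, and `StagedPipeline`, `apply_filter`, and `apply_sort` are hypothetical names for illustration, not existing Coquery classes or functions. It also assumes that each stage returns new data instead of modifying its input:

```python
class StagedPipeline:
    """Run stages in order and keep every intermediate result (hypothetical)."""

    def __init__(self, stages):
        # stages: list of (name, function) pairs; each function takes
        # (data, params) and must return new data without mutating its input.
        self.stages = list(stages)
        self._cache = []      # one (params, output) pair per completed stage
        self._source = None
        self.executed = []    # names of the stages run by the last call

    def run(self, source, params):
        self.executed = []
        if source is not self._source:
            # A different input object invalidates every stored result.
            self._cache = []
            self._source = source
        # Find the first stage whose parameters differ from the stored run.
        start = 0
        for (name, _), (old_params, _) in zip(self.stages, self._cache):
            if params.get(name) != old_params:
                break
            start += 1
        # Discard the results from the changed stage onwards.
        del self._cache[start:]
        data = self._cache[-1][1] if self._cache else source
        for name, func in self.stages[start:]:
            data = func(data, params.get(name))
            self._cache.append((params.get(name), data))
            self.executed.append(name)
        return data


def apply_filter(rows, min_freq):
    return [r for r in rows if min_freq is None or r["freq"] >= min_freq]


def apply_sort(rows, column):
    return sorted(rows, key=lambda r: r[column]) if column else list(rows)


rows = [{"word": "b", "freq": 3}, {"word": "a", "freq": 7}, {"word": "c", "freq": 1}]
pipe = StagedPipeline([("filter", apply_filter), ("sort", apply_sort)])
pipe.run(rows, {"filter": 2, "sort": "word"})   # runs both stages
pipe.run(rows, {"filter": 2, "sort": None})     # removing the sorter reruns only "sort"
```

The trade-off is memory: every stage keeps a full copy of its output, which may be expensive for large results tables.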

gkunter commented 7 years ago

Now that the contexts are cached, this doesn't seem to be a pressing issue anymore.

gkunter commented 7 years ago

This is still an issue after all. For example, adding a function to a results table that was itself produced by a time-intensive aggregation causes the whole results table to be re-evaluated, as does changing a result filter.

gkunter / coquery

Smarter data management #195