deephaven / deephaven-core

Deephaven Community Core

EPIC: Statelessness by default and better query optimization #4896

Open rcaudy opened 10 months ago

rcaudy commented 10 months ago

Historically, filters (Filter, WhereFilter) and columns (Selectable, SelectColumn) have been assumed to be stateful unless we could easily prove otherwise. This prevents decomposing evaluations into sub-evaluations, and it also prevents parallelization; either action can re-order evaluations in a way that violates a user's assumptions regarding state.
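
To make the reordering hazard concrete, here is a minimal sketch in plain Java. It does not use any Deephaven API; `StatefulOrderDemo` and its methods are hypothetical names invented for this illustration. The same rows are evaluated in two different visit orders, which is what parallelization or decomposition could cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; none of these names exist in Deephaven. */
class StatefulOrderDemo {
    /** Stateless: the result depends only on the input value. */
    static IntPredicate statelessEven() {
        return v -> v % 2 == 0;
    }

    /** Stateful: the result depends on how many values were tested before this one. */
    static IntPredicate statefulEveryOther() {
        final int[] seen = {0};
        return v -> seen[0]++ % 2 == 0;
    }

    /**
     * Tests the rows in the given visit order and returns the accepted
     * values in their original row order.
     */
    static List<Integer> accepted(int[] values, int[] visitOrder, IntPredicate filter) {
        boolean[] keep = new boolean[values.length];
        for (int row : visitOrder) {
            keep[row] = filter.test(values[row]);
        }
        List<Integer> out = new ArrayList<>();
        for (int row = 0; row < values.length; row++) {
            if (keep[row]) {
                out.add(values[row]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {10, 11, 12, 13};
        int[] forward = {0, 1, 2, 3};
        int[] reverse = {3, 2, 1, 0};
        System.out.println(accepted(values, forward, statelessEven()));
        System.out.println(accepted(values, reverse, statelessEven()));
        System.out.println(accepted(values, forward, statefulEveryOther()));
        System.out.println(accepted(values, reverse, statefulEveryOther()));
    }
}
```

The stateless predicate accepts the same rows in either visit order, while the counter-based predicate accepts a different set once the order changes. That is the kind of user assumption a reordering can violate.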

We should:

  1. Offer users a way to express that their evaluations are indeed stateful
  2. Change the default assumption from "evaluations are stateful" to "evaluations are stateless"
  3. Parallelize more, leveraging the new assumption
  4. Introduce expression decomposition into our parser and related stack
  5. Apply stateless filters to data indexes in where, and exclude re-applying stateless filters whose input columns were not modified
  6. Ensure that stateful filters act as a reordering barrier when applying filters (that is, a filter must never be reordered across a stateful filter).
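
One possible reading of the barrier rule in item 6, sketched in plain Java. Everything here is an assumption made for illustration: `FilterSpec`, its `cost` field, and the cheapest-first ordering are hypothetical and not part of the Deephaven API. The only point shown is that stateless filters are reordered within a run, and never across a stateful filter.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; these types are not Deephaven classes. */
class BarrierReorderSketch {
    static final class FilterSpec {
        final String name;
        final boolean stateful;
        final int cost; // assumed cost estimate, used only to motivate a reordering

        FilterSpec(String name, boolean stateful, int cost) {
            this.name = name;
            this.stateful = stateful;
            this.cost = cost;
        }
    }

    private static final Comparator<FilterSpec> BY_COST = Comparator.comparingInt(f -> f.cost);

    /**
     * Sorts each run of consecutive stateless filters by cost (stable sort),
     * leaving every stateful filter at its original position as a barrier.
     */
    static List<FilterSpec> reorder(List<FilterSpec> filters) {
        List<FilterSpec> result = new ArrayList<>();
        List<FilterSpec> run = new ArrayList<>();
        for (FilterSpec f : filters) {
            if (f.stateful) {
                run.sort(BY_COST);
                result.addAll(run);
                run.clear();
                result.add(f); // barrier: nothing moves past this filter
            } else {
                run.add(f);
            }
        }
        run.sort(BY_COST);
        result.addAll(run);
        return result;
    }

    /** Convenience for inspecting the resulting order. */
    static List<String> names(List<FilterSpec> filters) {
        List<String> out = new ArrayList<>();
        for (FilterSpec f : filters) {
            out.add(f.name);
        }
        return out;
    }
}
```

For example, given stateless filters A (cost 5) and B (cost 1), then a stateful filter S, then stateless C (cost 3) and D (cost 2), this sketch yields the order B, A, S, D, C. A and B swap, C and D swap, and nothing crosses S.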

    Compile-Latency Optimizations Worth Looking At:

    1. Don't compile formulas that are being replaced with static results. (e.g. A = NULL_INT)
    2. Consider re-using a formula for simple lambda expressions. (e.g. B = A.getId())
    3. Consider re-using a JavaFileManager when compiling multiple formulas at once. (See #4814)
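
For item 3, a rough sketch of what sharing a file manager can look like with the standard `javax.tools` API. This is not taken from Deephaven's compiler code or from #4814; `SharedFileManagerSketch`, `StringSource`, and the temp-directory output are assumptions made for the example. It only shows several compilation tasks running against a single `StandardJavaFileManager` instead of creating one per formula.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

/** Hypothetical sketch; not Deephaven's actual compilation path. */
class SharedFileManagerSketch {
    /** In-memory source file, so the example needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                    Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /**
     * Compiles each source as its own task against one shared file manager
     * and returns how many tasks succeeded.
     */
    static int compileAll(List<StringSource> sources) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available (JRE without javac?)");
        }
        int succeeded = 0;
        try (StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null)) {
            Path outDir = Files.createTempDirectory("formula-classes");
            List<String> options = List.of("-d", outDir.toString());
            for (StringSource source : sources) {
                // The file manager is created once and reused for every task.
                Boolean ok = compiler.getTask(null, fileManager, null, options, null,
                        List.of(source)).call();
                if (Boolean.TRUE.equals(ok)) {
                    succeeded++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return succeeded;
    }
}
```

Whether this helps in practice depends on how much of the compile latency comes from setting up the file manager, which would need to be measured.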

rcaudy commented 9 months ago

We should be sure the end result addresses #4959 in a satisfactory way. "Side-effectful" formulas should not be parallelized.