UBOdin / mimir

Data-ish exploration through SQL+Uncertainty
http://mimirdb.info
Apache License 2.0

Error list performance #328

Closed by okennedy 5 years ago

okennedy commented 5 years ago

ReasonSet source queries have already been through a determinism annotation pass: the query basically returns the set of inputs where a determinism annotation is true. The expression-level determinism annotation pass in particular creates gnarly disjunctions on some inputs. When expanding a ReasonSet, queries were previously run in (the default) BestGuess mode, so those disjunctions were fed through a second round of determinism annotation (i.e., is the determinism annotation itself non-deterministic?). That made them even gnarlier, or worse, left the query sitting there consuming memory and CPU for a disturbingly long time.

Basic solution: ReasonSet now runs every query it needs in UnannotatedBestGuess mode, which skips the determinism annotation pass.
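To make the blow-up concrete, here's a rough toy model in Python. None of this is Mimir code: the tuple-based expressions, the `annotate` rules, and `compile_query` are all made up for illustration, and only the two mode names come from the description above. The sketch just shows why annotating an already-annotated condition grows it again, while the unannotated mode leaves it alone.

```python
# Toy model only: these rules and names are illustrative assumptions,
# not Mimir's actual compiler.

def size(expr):
    """Count the nodes in a nested-tuple expression."""
    return 1 + sum(size(arg) for arg in expr[1:] if isinstance(arg, tuple))

def annotate(expr):
    """Return an expression that is true when `expr` is deterministic."""
    op = expr[0]
    if op == "const":
        return ("const", True)
    if op == "var":
        # Assume each input column has a companion determinism flag.
        return ("var", expr[1] + "_det")
    if op == "not":
        return annotate(expr[1])
    if op in ("and", "or"):
        a, b = expr[1], expr[2]
        da, db = annotate(a), annotate(b)
        # A deterministic operand can settle the result on its own:
        # a false operand for AND, a true operand for OR.
        sa, sb = (a, b) if op == "or" else (("not", a), ("not", b))
        return ("or", ("and", da, db),
                ("or", ("and", da, sa), ("and", db, sb)))
    raise ValueError(f"unknown operator: {op}")

def compile_query(expr, mode):
    """Hypothetical stand-in for compiling a query in a given mode."""
    if mode == "BestGuess":
        return annotate(expr)  # runs the annotation pass
    if mode == "UnannotatedBestGuess":
        return expr            # skips the annotation pass
    raise ValueError(f"unknown mode: {mode}")

# A small condition, annotated once to play the role of a ReasonSet source query.
condition = ("or", ("and", ("var", "a"), ("var", "b")), ("var", "c"))
source = annotate(condition)

# Expanding the ReasonSet re-runs that source query in one of the two modes.
rerun_annotated = compile_query(source, "BestGuess")
rerun_plain = compile_query(source, "UnannotatedBestGuess")

print(size(condition), size(source), size(rerun_plain), size(rerun_annotated))
```

Even with a three-variable condition the second pass is noticeably bigger than the first, since every AND/OR node gets expanded into a disjunction that contains copies of its (already expanded) operands.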

This exposed a bug in UnannotatedBestGuess, which wasn't properly doing RowId annotations or a few other important steps in the compilation process. Those are fixed now too.

Overall, this seems to be at least an order-of-magnitude performance improvement on some queries.