UBOdin / mimir

Data-ish exploration through SQL+Uncertainty
http://mimirdb.info
Apache License 2.0

Sample operator #368

Closed: okennedy closed this 4 years ago

okennedy commented 4 years ago

This branch adds a new Operator called DrawSample, which makes sampling a first-class primitive (as a step towards addressing #362). Currently support exists for two forms of sampling:

Both forms of sampling are probabilistic (you get approximately the specified %, rather than exactly that amount).
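To illustrate what "approximately the specified %" means, here is a minimal sketch of per-row (Bernoulli) sampling, one common way to get this behavior. It is written in Python purely for illustration and is not Mimir's implementation: the name `bernoulli_sample` and its `fraction` and `seed` parameters are hypothetical, and this thread does not say how DrawSample draws its sample internally.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def bernoulli_sample(
    rows: Iterable[T], fraction: float, seed: Optional[int] = None
) -> Iterator[T]:
    """Keep each row independently with probability `fraction`.

    Hypothetical helper for illustration only; not part of Mimir.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    for row in rows:
        # Every row gets its own independent coin flip, so the total
        # number of rows kept is itself random.
        if rng.random() < fraction:
            yield row


if __name__ == "__main__":
    sample = list(bernoulli_sample(range(10_000), fraction=0.10, seed=42))
    # The size is close to 1000 but is not guaranteed to be exactly 1000.
    print(len(sample))
```

Because each row is decided independently, the sample can be drawn in a single pass without knowing the input size up front. The trade-off is that the output size varies around `fraction * n` instead of hitting it exactly.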

The choice to make DrawSample a distinct operator (rather than folding it into, say, Select) was motivated by the following:

I would like to revisit this decision, ideally once we add a more robust type system to Mimir (e.g., #367).

A few more notes:

mrb24 commented 4 years ago

@okennedy CREATE VIEW statements stopped parsing after this pull request was merged. I made a change to MimirSQL.scala to fix it. Was there a reason for having what I removed in this commit? Commit: 6e91f3bdb36f50567b71be815e58533556ebdc1e