Closed okennedy closed 4 years ago
@okennedy CREATE VIEW statements stopped parsing after I merged this pull request. I changed MimirSQL.scala to fix it. Was there a reason for the code I removed in this commit? Commit: 6e91f3bdb36f50567b71be815e58533556ebdc1e
This branch adds a new `Operator` called `DrawSample`, which makes sampling a first-class primitive (as a step towards addressing #362). Currently, support exists for two forms of sampling:
Both forms of sampling are probabilistic (you get approximately the specified %, rather than exactly that amount).
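To illustrate what "probabilistic" means here, the following is a minimal sketch of one common way to get this behavior: each row is kept independently with the requested probability, so the sample size only approximates the requested percentage. The function name `draw_sample` and its parameters are hypothetical and chosen for illustration; this is not Mimir's implementation of `DrawSample`, and it is written in Python only to keep the example self-contained.

```python
import random


def draw_sample(rows, probability, seed=None):
    """Keep each row independently with the given probability.

    Hypothetical helper for illustration only; not part of Mimir.
    """
    rng = random.Random(seed)  # a seeded generator makes the sample repeatable
    return [row for row in rows if rng.random() < probability]


rows = list(range(10_000))
sample = draw_sample(rows, 0.10, seed=42)

# The size is close to 10% of the input, but usually not exactly 1000.
print(len(sample))
```

Because every row is an independent coin flip, no pass over the data needs to know the total row count in advance, which is the usual trade-off made in exchange for an approximate sample size.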
The choice to make `DrawSample` a distinct operator (rather than folding it into, say, `Select`) was motivated by the following:
`udf` method. It would be possible in Mimir if we had support for dictionary types... but we don't. Thus, the only way to implement this (without resorting to repeated string parsing in a tight loop) is directly in `RAToSpark`... which means we need something on the order of an `Operator` to pull it off.
I would like to revisit this decision, ideally once we add a more robust type system into Mimir (e.g., #367).
A few more notes: