verdict-project / verdict

Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing
http://verdictdb.org
Apache License 2.0

Create Stratified Sampling #342

Closed: qingzma closed this issue 5 years ago

qingzma commented 5 years ago

Hi,

I am wondering whether there are any SQL examples for generating stratified samples. Currently, METHOD can only be (uniform | hash).

The paper says that stratified samples are generated in a non-trivial way, using a two-pass approach. Could you please provide more details on how the stratified sample is generated?
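
For illustration only, here is a minimal sketch of what a generic two-pass stratified sampler can look like. It is not VerdictDB's implementation, which this thread does not describe. The function name `two_pass_stratified_sample` and the parameters `key`, `per_stratum`, and `seed` are hypothetical. The first pass counts the rows in each stratum; the second pass keeps each row with probability min(1, per_stratum / stratum size), so small strata are kept in full and large strata are thinned.

```python
import random
from collections import Counter


def two_pass_stratified_sample(rows, key, per_stratum, seed=0):
    """Generic two-pass stratified sampling (illustrative, not VerdictDB's code).

    rows        -- a sequence that can be iterated twice
    key         -- function mapping a row to its stratum (hypothetical name)
    per_stratum -- target number of sampled rows per stratum (hypothetical name)
    Returns a list of (row, sampling_probability) pairs.
    """
    rng = random.Random(seed)

    # Pass 1: count how many rows fall into each stratum.
    counts = Counter(key(row) for row in rows)

    # Pass 2: keep each row with probability min(1, per_stratum / stratum size).
    sample = []
    for row in rows:
        prob = min(1.0, per_stratum / counts[key(row)])
        if rng.random() < prob:
            # Store the probability so aggregates can later be scaled by 1 / prob.
            sample.append((row, prob))
    return sample


# Example: stratum "a" is large, stratum "b" is rare.
rows = [("a", i) for i in range(1000)] + [("b", i) for i in range(5)]
sample = two_pass_stratified_sample(rows, key=lambda r: r[0], per_stratum=10)
```

Keeping the sampling probability next to each row lets a later aggregate weight rows by 1 / probability, which is what makes estimates over unevenly sampled strata unbiased.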

Thanks,

barzan commented 5 years ago

I don't think stratified samples are in the open-source version. But there is a plan to add them to the open-source version too, hopefully in a week or two.

qingzma commented 5 years ago

Thanks for the info.