Netflix / iceberg

Iceberg is a table format for large, slow-moving tabular data
Apache License 2.0

(WIP) Add in and notIn predicates #81

Closed omervk closed 5 years ago

omervk commented 5 years ago

This is ongoing work on implementing IN and NOT_IN at all levels, including the introduction of multi-value comparison predicates.

rdblue commented 5 years ago

Thanks for updating this, @omervk! I'll review it in the morning. I had a quick look and I see that the evaluations are still O(n) implementations, which I think we need to change.
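
(Illustration, not code from this PR.) The O(n) concern above can be sketched as follows: an IN predicate that scans its literal list on every evaluated value costs time proportional to the number of literals, while one that builds a hash set once answers each membership test in expected constant time. The class names `LinearIn`, `SetIn`, and `SetNotIn` are hypothetical and are not Iceberg APIs. The sketch is in Python for brevity and ignores null semantics and type conversion.

```python
from typing import Any, Hashable, Iterable


class LinearIn:
    """IN predicate that scans every literal per evaluation: O(n) per value."""

    def __init__(self, literals: Iterable[Any]) -> None:
        self.literals = list(literals)

    def test(self, value: Any) -> bool:
        # Compare against each literal in turn until one matches.
        for literal in self.literals:
            if literal == value:
                return True
        return False


class SetIn:
    """IN predicate backed by a hash set: expected O(1) per value."""

    def __init__(self, literals: Iterable[Hashable]) -> None:
        # The set is built once, when the predicate is constructed.
        self.literals = frozenset(literals)

    def test(self, value: Hashable) -> bool:
        return value in self.literals


class SetNotIn:
    """NOT_IN predicate: the negation of the set-backed IN test."""

    def __init__(self, literals: Iterable[Hashable]) -> None:
        self.literals = frozenset(literals)

    def test(self, value: Hashable) -> bool:
        return value not in self.literals
```

Both IN variants return the same answers; only the per-value cost differs, which matters when a predicate with many literals is evaluated against many rows.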

rdblue commented 5 years ago

@omervk, if you want to continue working on this, please re-open it in the apache/incubator-iceberg repository. That's the project's new home. Thanks!

omervk commented 5 years ago

Thanks, Ryan! Yes, I would like to come back to this issue soon, but I've learned a lot during this PR and will probably start over in the new repo. Congrats on the move :)

aokolnychyi commented 5 years ago

Hey, @omervk. Would you be interested in continuing this work and submitting a PR in the new repo? This PR can give a significant performance benefit for some queries.

omervk commented 5 years ago

@aokolnychyi thanks! I would love to continue working on this, but I can't promise anything timeline-wise, so it would be best to just let someone else work on it if they want to.

raveeram commented 5 years ago

@omervk @aokolnychyi I've started working on this. Please do let me know if one of you already is.

aokolnychyi commented 5 years ago

@raveeram I started looking into it but didn't have enough time. It would be great if you could take over @omervk's code and submit a new PR. I'll be happy to review/help.

raveeram commented 5 years ago

Sounds great, thanks @aokolnychyi!