apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[R] Support inequality joins #29841

Open asfimport opened 2 years ago

asfimport commented 2 years ago

dplyr 1.1 supports inequality conditions in joins. We should explore what we can support.

Reporter: Neal Richardson / @nealrichardson

Note: This issue was originally created as ARROW-14264. Please see the migration documentation for further details.

wkumler commented 2 months ago

Just wanting to give a shout-out to this issue! Would love to see these implemented in Arrow because they're fairly core to my workflow. I do a lot of range searches on continuously valued data (very high cardinality), and my experiments with relational databases (SQLite, PostgreSQL) imply that performing these as a non-equi join is consistently faster than looping or pasting the searches together with an OR clause. I don't know whether that would also be the case with Arrow, but if so that would be huge, since Arrow is already near-optimal for the workflow with the looping/pasting method.
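
For readers unfamiliar with the two approaches compared above, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`points`, `ranges`), column names, and data are hypothetical and chosen only for illustration; it shows that the two query shapes select the same rows and makes no claim about their relative speed in SQLite or Arrow.

```python
import sqlite3

# Hypothetical data: continuous measurements and the ranges to search for.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (id INTEGER, value REAL)")
conn.execute("CREATE TABLE ranges (name TEXT, lo REAL, hi REAL)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(1, 0.5), (2, 1.7), (3, 2.2), (4, 3.9), (5, 5.1)],
)
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [("a", 1.0, 2.5), ("b", 3.5, 4.0)],
)

# Approach 1: non-equi (inequality) join. Each point is matched to every
# range whose bounds contain it.
join_rows = conn.execute(
    "SELECT p.id, r.name FROM points AS p "
    "JOIN ranges AS r ON p.value >= r.lo AND p.value <= r.hi "
    "ORDER BY p.id"
).fetchall()

# Approach 2: paste the range searches together into one OR clause.
ranges = conn.execute("SELECT lo, hi FROM ranges").fetchall()
where = " OR ".join("(value >= ? AND value <= ?)" for _ in ranges)
params = [bound for pair in ranges for bound in pair]
or_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM points WHERE {where} ORDER BY id", params
    )
]

print(join_rows)  # matched (point id, range name) pairs
print(or_ids)     # ids of points falling in any range
```

Note that the join form also reports which range each point matched, whereas the OR form only filters. In dplyr 1.1 the same join condition would be written with `join_by(value >= lo, value <= hi)`, which is the feature this issue asks Arrow to support.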