apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++][Compute] Support to initialize expression with a string #32625

Open asfimport opened 1 year ago

asfimport commented 1 year ago

I would like to implement this feature, and first want to ask which approach would be better.

 

For example, I want to initialize an expression whose content is

a210 - (a210 /203) * 203 = 0

This means that column a210 modulo 203 is equal to 0 (assuming integer division).
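As a quick check of the arithmetic, here is a minimal sketch in plain C++ (no Arrow involved). It assumes integer operands and a division that truncates toward zero, as C++ integer division does; the function names `mod_via_division` and `is_multiple_of_203` are made up for illustration.

```cpp
#include <cstdint>

// Remainder written with only subtract, multiply and divide,
// mirroring a210 - (a210 / 203) * 203 from the issue.
// Assumption: '/' is integer division truncating toward zero.
std::int64_t mod_via_division(std::int64_t a, std::int64_t b) {
  return a - (a / b) * b;
}

// The full predicate from the issue: the remainder equals 0.
bool is_multiple_of_203(std::int64_t a210) {
  return mod_via_division(a210, 203) == 0;
}
```

With these definitions, `is_multiple_of_203(406)` is true and `is_multiple_of_203(407)` is false. For floating-point columns the identity would not hold as written, because the division would not truncate.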

 

How would you compare these two approaches?

 

"(subtract(a210, multiply(divide(a210, 203), 203)) == 0)" to Expression

or

"a210-(a210/203)*203==0" to Expression

Reporter: LinGeLin Assignee: Sasha Krassovsky / @save-buffer


Note: This issue was originally created as ARROW-17351. Please see the migration documentation for further details.

asfimport commented 1 year ago

Weston Pace / @westonpace: I think the first (more verbose) option is preferred because it will be more generic.

However, once the first option is working, the second option can always be added later as an optional shortcut, so that both are supported.

asfimport commented 1 year ago

Eduardo Ponce / @edponce: I have several comments with respect to string-to-Expression support:

  1. Favor explicit names for operations (via prefix notation) instead of algebraic expressions

    1. String expressions require defining an escape character and rules for handling this type of nuisance.

    2. There have been ideas to support a text format for defining unit tests' inputs. Ideally, a single text format would be practical for both cases; otherwise, it may not make sense to pursue an Arrow custom text format.
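To illustrate why prefix notation is simple to support, here is a minimal recursive-descent parser sketch in standard C++. It is not Arrow code: `Node`, `PrefixParser` and `ToString` are hypothetical names invented for this example, and the comparison is assumed to be spelled as an `equal(...)` call rather than `==`. Names that start with a digit are treated as integer literals, and everything else without parentheses is treated as a field reference.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node; a stand-in for a real expression type.
struct Node {
  enum class Kind { Call, Field, Literal } kind;
  std::string text;                         // function name, field name, or digits
  std::vector<std::unique_ptr<Node>> args;  // only populated for Kind::Call
};

class PrefixParser {
 public:
  explicit PrefixParser(std::string input) : input_(std::move(input)) {}

  // Parses the whole input as a single expression.
  std::unique_ptr<Node> Parse() {
    auto node = ParseNode();
    SkipSpaces();
    if (pos_ != input_.size()) throw std::invalid_argument("trailing characters");
    return node;
  }

 private:
  void SkipSpaces() {
    while (pos_ < input_.size() &&
           std::isspace(static_cast<unsigned char>(input_[pos_]))) ++pos_;
  }

  // Reads a run of letters, digits and underscores.
  std::string ReadToken() {
    SkipSpaces();
    std::size_t start = pos_;
    while (pos_ < input_.size() &&
           (std::isalnum(static_cast<unsigned char>(input_[pos_])) ||
            input_[pos_] == '_')) ++pos_;
    if (start == pos_) throw std::invalid_argument("expected a name or number");
    return input_.substr(start, pos_ - start);
  }

  // Advances past `c` if it is the next non-space character.
  bool Consume(char c) {
    SkipSpaces();
    if (pos_ < input_.size() && input_[pos_] == c) { ++pos_; return true; }
    return false;
  }

  std::unique_ptr<Node> ParseNode() {
    auto node = std::make_unique<Node>();
    node->text = ReadToken();
    if (Consume('(')) {  // name followed by '(' is a function call
      node->kind = Node::Kind::Call;
      if (!Consume(')')) {
        do { node->args.push_back(ParseNode()); } while (Consume(','));
        if (!Consume(')')) throw std::invalid_argument("expected ')'");
      }
    } else if (std::isdigit(static_cast<unsigned char>(node->text[0]))) {
      node->kind = Node::Kind::Literal;
    } else {
      node->kind = Node::Kind::Field;
    }
    return node;
  }

  std::string input_;
  std::size_t pos_ = 0;
};

// Renders the tree back to prefix notation, useful for round-trip checks.
std::string ToString(const Node& node) {
  if (node.kind != Node::Kind::Call) return node.text;
  std::string out = node.text + "(";
  for (std::size_t i = 0; i < node.args.size(); ++i) {
    if (i > 0) out += ", ";
    out += ToString(*node.args[i]);
  }
  return out + ")";
}
```

Parsing `equal(subtract(a210, multiply(divide(a210, 203), 203)), 0)` with `PrefixParser(...).Parse()` yields a `Call` node named `equal` with two arguments. A real implementation could presumably walk such a tree and map each node onto Arrow's expression factories (`arrow::compute::call`, `field_ref`, `literal`), but that mapping is not shown here. The sketch also sidesteps quoting and escaping entirely, which is the part the comment above flags as needing explicit rules.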