BurntSushi / xsv

A fast CSV command line toolkit written in Rust.
The Unlicense

Numeric equality and comparisons #31

Open Armavica opened 8 years ago

Armavica commented 8 years ago

Currently, even when foo is detected as integer-valued, search -s foo 42 matches on the values of foo which contain 42 in their digits, instead of matching on the values 42, as one could possibly expect. A workaround is search -s foo '^42$' or even search -s foo '^0*42$' in case of leading zeros in the file. It could also be useful to search for fields greater or less than a given value. Would it make sense to implement these operations?
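To make the difference concrete, here is a small Python sketch (not xsv itself, and using Python's re module rather than the Rust regex engine xsv uses) that compares the unanchored pattern, the two anchored workarounds, and a true numeric comparison. The sample values and the int_equals helper are made up for illustration.

```python
import re

# Hypothetical values of the column "foo".
values = ["42", "142", "420", "042", "7"]

# Unanchored pattern: matches any value whose digits contain "42".
substring = [v for v in values if re.search(r"42", v)]

# Anchored workaround: matches only the exact text "42".
exact = [v for v in values if re.search(r"^42$", v)]

# Anchored workaround that also tolerates leading zeros.
leading_zeros = [v for v in values if re.search(r"^0*42$", v)]


def int_equals(text, target):
    """Compare a field to an integer by value; non-integers never match."""
    try:
        return int(text) == target
    except ValueError:
        return False


# Numeric equality: parse the field, then compare numbers rather than text.
numeric = [v for v in values if int_equals(v, 42)]

print(substring)      # ['42', '142', '420', '042']
print(exact)          # ['42']
print(leading_zeros)  # ['42', '042']
print(numeric)        # ['42', '042']
```

The numeric comparison gives the same result as the '^0*42$' workaround here, but without having to anticipate how the number happens to be written in the file.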

BurntSushi commented 8 years ago

I think something like that would be useful. I suspect the challenging aspect of this will be coming up with a simple way to expose the functionality, since it could easily become quite complex. I'm not quite sure how to do it.

Armavica commented 8 years ago

I was thinking of a nsearch command with an optional argument for equality (probably only allowed on integer fields) and -gt <num>, -ge <num>, -lt <num>, -le <num> options for comparisons.
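As a rough illustration of what such a command might do internally, the following Python sketch filters rows in a single streaming pass. The nsearch function, the COMPARISONS table, and the sample data are all hypothetical; none of this exists in xsv. Fields are parsed as floats here for simplicity, although the proposal suggests equality would probably be restricted to integer fields.

```python
import csv
import io
import operator

# Hypothetical mapping from the proposed option names to comparisons.
COMPARISONS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def nsearch(rows, column, op, threshold):
    """Yield rows whose `column` parses as a number satisfying `op threshold`."""
    compare = COMPARISONS[op]
    for row in rows:
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric fields never match.
            continue
        if compare(value, threshold):
            yield row


# Made-up sample data standing in for a CSV file.
data = "foo,bar\n42,a\n142,b\n7,c\nn/a,d\n"
rows = csv.DictReader(io.StringIO(data))
matches = list(nsearch(rows, "foo", "ge", 42))
print([row["bar"] for row in matches])  # ['a', 'b']
```

Because each row is examined once and then discarded, memory use stays constant regardless of file size.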

ant6n commented 7 years ago

I think a more general --where expr flag could be more useful. One could start with very simple expressions, just a subset of the query language, like AND and OR, and some binary operators <>=. And always with the assumption that only expressions that can be done in a single streaming pass should be supported.

Ideally the flag could also be applied to select, so one could write

xsv select a,b --where "a>4"
xsv select a,b -w "a>4 AND b==0"

(it would be kind of pretty if one could omit the -- in front of where and make it look closer to an actual sql query).
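To give a sense of how small the simplest version of this could be, here is a Python sketch of a where-style predicate limited to numeric comparisons joined by AND. It is only an illustration of the idea: compile_where, the CLAUSE pattern, and the supported grammar are assumptions, OR and parentheses are deliberately left out, and nothing like this exists in xsv.

```python
import operator
import re

# Hypothetical operator table for the sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# One clause: <column> <operator> <number>, e.g. "a>4" or "b == 0".
CLAUSE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def compile_where(expr):
    """Turn an expression like 'a>4 AND b==0' into a per-row predicate."""
    clauses = []
    for part in expr.split(" AND "):
        match = CLAUSE.match(part)
        if match is None:
            raise ValueError(f"unsupported clause: {part!r}")
        column, op, number = match.groups()
        clauses.append((column, OPS[op], float(number)))

    def predicate(row):
        try:
            return all(
                compare(float(row[column]), number)
                for column, compare, number in clauses
            )
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric fields make the row fail the filter.
            return False

    return predicate


keep = compile_where("a>4 AND b==0")
rows = [
    {"a": "5", "b": "0"},
    {"a": "3", "b": "0"},
    {"a": "9", "b": "1"},
    {"a": "x", "b": "0"},
]
print([keep(row) for row in rows])  # [True, False, False, False]
```

The predicate looks at one row at a time, so it fits the single-streaming-pass constraint. Anything beyond this (OR, precedence, string comparisons) would need a real parser.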

BurntSushi commented 7 years ago

Adding an arbitrary expression evaluator is a much bigger task, and I'm not at all convinced that I want to maintain such a thing. It should be considered an issue distinct from this one.

ant6n commented 7 years ago

I agree. However, I'm proposing it may make sense to work towards that in terms of interface -- even if only very few expressions are supported. This way one can avoid redundancies in the interface, and expand capabilities without changing the interface (i.e. once there's a where flag, there are no more changes to the interface besides expanding expression support).

BurntSushi commented 7 years ago

I disagree. Removing or deprecating flags is easy. But so is feature creep when you start doing arbitrary expressions. It should be a separate issue.