Blosc / bcolz

A columnar data container that can be compressed.
http://bcolz.blosc.org

Implement logical operators for carray objects #195

Open FrancescAlted opened 9 years ago

FrancescAlted commented 9 years ago

This would allow the following code to work correctly:

In [1]: import bcolz
In [2]: a = bcolz.arange(1e7)
In [3]: b = bcolz.arange(1e7)
In [4]: a == b
Out[4]: False

For example, the equality operator could easily (and efficiently) be implemented as:

In [5]: %time bcolz.eval("a != b").sum() == 0
CPU times: user 147 ms, sys: 32.8 ms, total: 180 ms
Wall time: 84.4 ms
Out[5]: True
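
The eval-based check above computes the full comparison before reducing it. As a rough illustration of the same idea done block by block, with an early exit at the first mismatch, here is a minimal sketch. It uses plain Python sequences as a stand-in for carrays; `blockwise_equal` and its `blen` parameter are hypothetical names and not part of the bcolz API.

```python
from itertools import islice


def blockwise_equal(a, b, blen=4096):
    """Return True if two sized sequences hold the same values.

    Hypothetical helper, not part of bcolz: plain sequences stand in
    for carrays, and `blen` is an assumed block length.
    """
    # Containers of different length can never be equal.
    if len(a) != len(b):
        return False
    it_a, it_b = iter(a), iter(b)
    while True:
        # Pull one block from each side; only two blocks are held at a time.
        block_a = list(islice(it_a, blen))
        block_b = list(islice(it_b, blen))
        if not block_a:
            # Both inputs are exhausted without finding a mismatch.
            return True
        if block_a != block_b:
            # Stop at the first differing block.
            return False
```

Because only values are compared, two containers with the same contents compare equal under this sketch regardless of where they are stored.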

FrancescAlted commented 9 years ago

Oh, on second thought, I remember now that I deliberately left all the operators unimplemented for bcolz. The reason is that once 'a == b' and 'a - b' work, people will expect '(a - b) == c' to work too, and that is well out of scope; it is much preferable that people use bcolz.eval() for that (or dask, if the former is not enough).

However, what we can do is raise a 'NotImplementedError' on carray operations, suggesting that people use .eval() instead. Opinions?
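
To make the proposal concrete, here is a minimal sketch of a container whose operators raise instead of silently returning a misleading result. `NoOperators` is a hypothetical stand-in class, not the carray implementation, and the error message wording is an assumption.

```python
class NoOperators:
    """Hypothetical stand-in for a container that rejects operators."""

    def __init__(self, data):
        self.data = list(data)

    def _unsupported(self, other):
        # Fail loudly and point the user at the expression evaluator.
        raise NotImplementedError(
            "operators are not supported on this container; "
            "use eval() on an expression string instead"
        )

    # Route comparison and arithmetic operators to the same error.
    __eq__ = __ne__ = __add__ = __sub__ = _unsupported
    # Defining __eq__ already makes instances unhashable; stated explicitly.
    __hash__ = None
```

With this in place, `a == b` and `a - b` both raise `NotImplementedError` rather than falling back to an identity comparison.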

esc commented 9 years ago

I think implementing == and != might be OK, though I'm not sure. Can you explain why (a - b) == c is out of scope?

Also, raising NotImplementedError would be better than just silently defaulting to False.

esc commented 9 years ago

Also, how would == work for carrays that have the same data, but different rootdirs?