The speed of bitwise_or seems off compared to what I see when I test with riptable.
On my computer, in riptable, the bitwise_or of 1 million int32 values takes 59 microseconds.
When I use the same code to speed it up for numpy (we want numpy to be as fast as riptable or faster, and that is where I am putting my energy),
I get 1.51 ms. Something is off, but I am not sure what. I suspect something internal to numpy, but I am not sure.
Could the hook have messed it up? I tried without hooking and got the same slower speed.
More information on this: it works fine for int8, int16, and int64, but int32 and uint32 are somehow different.
I get called back for int8/16/64, as I can see from the speeds below.
I do not get called back for int32.
Also, I am not sure why numpy takes 503 µs for int8 and 589 µs for int64, whose elements are 8 times larger (so it should take about 8 times longer), but I am not sure we care, since this will be solved once the loops are properly taken over.
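To make the per-dtype comparison easier to reproduce outside IPython, here is a minimal sketch that times the same in-place call for each integer width and also reports how many array bytes are covered per second. The helper names `best_time_us` and `report` are made up for this post, and it uses `np.zeros` instead of `np.arange` on the assumption that the values do not affect the timing.

```python
import timeit

import numpy as np

DTYPES = (np.int8, np.int16, np.int32, np.uint32, np.int64, np.uint64)


def best_time_us(dtype, n=1_000_000, repeat=5, number=100):
    """Best-of-`repeat` time for one in-place bitwise_or call, in microseconds."""
    a = np.zeros(n, dtype=dtype)  # values should not matter for the timing
    timer = timeit.Timer(lambda: np.bitwise_or(a, a, out=a))
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


def report(n=1_000_000, repeat=5, number=100):
    """Return (dtype name, itemsize, microseconds, GB/s of array bytes) rows."""
    rows = []
    for dt in DTYPES:
        d = np.dtype(dt)
        us = best_time_us(dt, n=n, repeat=repeat, number=number)
        # bytes per microsecond equals MB/s, so divide by 1e3 to get GB/s
        gb_per_s = (n * d.itemsize) / us / 1e3
        rows.append((d.name, d.itemsize, us, gb_per_s))
    return rows


if __name__ == "__main__":
    for name, itemsize, us, gbs in report():
        print(f"{name:>7}  {itemsize} B/elem  {us:9.1f} us  {gbs:6.2f} GB/s")
```

If the loop were limited by memory traffic, the GB/s column would be roughly flat across dtypes. A roughly flat microseconds column, as in the session below, would instead suggest a fixed per-element cost.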
In [1]: import numpy as np
In [2]: a=np.arange(1_000_000, dtype=np.int8)
In [3]: %timeit np.bitwise_or(a,a, out=a)
503 µs ± 2.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [4]: a=np.arange(1_000_000, dtype=np.int16)
In [5]: %timeit np.bitwise_or(a,a, out=a)
510 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [6]: a=np.arange(1_000_000, dtype=np.int32)
In [7]: %timeit np.bitwise_or(a,a, out=a)
511 µs ± 7.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [8]: a=np.arange(1_000_000, dtype=np.int64)
In [9]: %timeit np.bitwise_or(a,a, out=a)
589 µs ± 34.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [10]: a=np.arange(1_000_000, dtype=np.uint32)
In [11]: %timeit np.bitwise_or(a,a, out=a)
506 µs ± 2.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [12]:
In [12]: import numpy as np; import _fast_numpy_loops as fa
In [13]: fa.initialize()
taking over func add
taking over func subtract
taking over func multiply
taking over func true_divide
taking over func floor_divide
taking over func power
taking over func remainder
taking over func logical_and
taking over func logical_or
taking over func bitwise_and
taking over func bitwise_or
taking over func bitwise_xor
In [14]: %timeit np.bitwise_or(a,a, out=a)
--> 514 µs ± 7.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
** I don't think my loop is getting called here (a is still the uint32 array from In [10])
In [15]: a=np.arange(1_000_000, dtype=np.uint64)
In [16]: %timeit np.bitwise_or(a,a, out=a)
82.1 µs ± 2.65 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [17]: a=np.arange(1_000_000, dtype=np.int8)
In [18]: %timeit np.bitwise_or(a,a, out=a)
30.5 µs ± 2.57 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
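One thing that might be worth checking, and this is only a guess I have not confirmed: numpy identifies each loop by a C type code, and on some platforms two C types (for example C int and C long) have the same width. If the hook is registered for one code while `np.int32` arrays carry the other, the replacement would never be selected. The sketch below only prints what the local numpy build reports; `describe` is a made-up helper name, and `np.bitwise_or.types` lists the built-in loop signatures.

```python
import numpy as np

DTYPES = (np.int8, np.int16, np.int32, np.uint32, np.int64, np.uint64)


def describe(ufunc=np.bitwise_or, dtypes=DTYPES):
    """Return (name, type char, type number, signature, has_builtin_loop) rows."""
    rows = []
    for dt in dtypes:
        d = np.dtype(dt)
        sig = f"{d.char}{d.char}->{d.char}"  # e.g. 'bb->b' for int8
        rows.append((d.name, d.char, d.num, sig, sig in ufunc.types))
    return rows


if __name__ == "__main__":
    for name, char, num, sig, found in describe():
        print(f"{name:>7}  char={char!r}  num={num:2d}  {sig}  builtin loop: {found}")
    # Compare the fixed-width alias with the C-named types on this platform.
    for alias in (np.intc, np.int_, np.longlong):
        d = np.dtype(alias)
        print(f"{alias.__name__:>9}  char={d.char!r}  num={d.num}  itemsize={d.itemsize}")
```

Comparing the `num` printed for int32 and uint32 with the type numbers passed when taking over the loops would show whether the two agree.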