Here is a snippet showing how a scalar UDF fails (mainly for float32, but also for some Python idioms):
import numpy as np
import iarray as ia
from iarray import udf
@udf.scalar(lib="lib")
# def fcond(a: udf.float32, b: udf.float32) -> float:  # does not work for arrays with float32
def fcond(a: udf.float64, b: udf.float64) -> float:
    # The next does not work:
    if (a + b) > 3:
        return 1
    else:
        return 0
    # The one below works (for float64)
    # c = 0.
    # if (a + b) > 3:
    #     c = 1.
    # return c
    #
    # return 1 if (a + b) > 3 else 0  # this also works (for float64)
N = 10_000_000
dtype = np.float64 # change here for testing the float32
print("** scalar udf evaluation ...")
a1 = ia.arange([N], dtype=dtype)
a2 = ia.ones([N], dtype=dtype)
expr = ia.expr_from_string("lib.fcond(a, b)", {"a": a1, "b": a2})
b1 = expr.eval()
print("** numpy evaluation ...")
b2 = (a1.data + a2.data) > 3
print(b2)
np.testing.assert_array_equal(b1.data, b2)
It turns out that float32 is an important data type for us, so fixing it has priority. It would also be nice if the snippet above could run for the integer types (int8, int16, int32 and int64); however, that is less important for now.
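To make the expected behaviour for each dtype concrete, here is a minimal NumPy-only sketch of the reference values the UDF result could be checked against. It does not touch iarray at all. The helper name `expected_fcond`, the small `N`, and the choice to express the reference as float64 are assumptions made for illustration, not part of the original reproducer.

```python
import numpy as np


def expected_fcond(a, b):
    """NumPy reference for fcond: 1.0 where a + b > 3, else 0.0 (hypothetical helper)."""
    return np.where((a + b) > 3, 1.0, 0.0)


# Small on purpose (assumption): int8 must be able to hold every value of arange(N) + 1.
N = 100
dtypes = [np.float32, np.float64, np.int8, np.int16, np.int32, np.int64]

references = {}
for dtype in dtypes:
    a = np.arange(N, dtype=dtype)
    b = np.ones(N, dtype=dtype)
    # Key by the dtype name so each reference can be looked up when testing that dtype.
    references[np.dtype(dtype).name] = expected_fcond(a, b)
```

Each entry in `references` could then be compared with the output of the UDF evaluated on iarray inputs of the same dtype, for example with `np.testing.assert_array_equal`.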