kuzudb / kuzu-docs

http://docs.kuzudb.com/
Creative Commons Attribution Share Alike 4.0 International

Decimal type in UDFs needs to be documented #245

Open prrao87 opened 3 weeks ago

prrao87 commented 3 weeks ago

Original issue: https://github.com/kuzudb/kuzu/issues/4082

The main problem is that Python's Decimal type offers no clean, intuitive way to specify the number of digits after the decimal point. When a Python UDF operates on DECIMAL values, information can be lost as the data moves between float in Python and DECIMAL(7,2) in Kùzu. This can be addressed by explicitly specifying the UDF's parameter and return types, as shown below.

import kuzu

# `conn` is assumed to be an open kuzu.Connection

# --- UDF ---

def calculate_discounted_price(price: float, has_discount: bool) -> float:
    """Assume a 10% discount on all items, for simplicity."""
    return price * 0.9 if has_discount else price

# define the expected type of the UDF's parameters
parameters = ['DECIMAL(7, 2)', kuzu.Type.BOOL]

# define expected type of the UDF's returned value
return_type = 'DECIMAL(7, 2)'

# register the UDF
conn.create_function("current_price", calculate_discounted_price, parameters, return_type)

Unlike built-in native types such as kuzu.Type.BOOL, DECIMAL has to be specified as a string, e.g. 'DECIMAL(7, 2)', which the binder in Kùzu then parses and maps to the internal Decimal representation. This isn't intuitive and should be explained clearly in the docs.
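To make the information loss concrete, here is a minimal sketch using only Python's standard `decimal` module (no Kùzu calls). The helper `to_decimal_7_2` is hypothetical and not part of Kùzu's API, and the half-up rounding mode is an assumption; Kùzu's actual conversion may round differently. It only illustrates that a float result carries no fixed scale until it is explicitly mapped to two fractional digits.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_decimal_7_2(value: float) -> Decimal:
    """Hypothetical helper: map a float to a value with 2 fractional digits.

    Mimics the scale of DECIMAL(7, 2). The rounding mode is an assumption.
    """
    return Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Float arithmetic has no notion of "two digits after the decimal point"
discounted = 19.99 * 0.9
print(repr(discounted))

# Only an explicit conversion pins the scale to 2 fractional digits
print(to_decimal_7_2(discounted))

# A classic binary floating-point artifact, cleaned up by the conversion
print(repr(0.1 + 0.2), "->", to_decimal_7_2(0.1 + 0.2))
```

Declaring 'DECIMAL(7, 2)' for the UDF's parameters and return value plays the role of this explicit conversion step: it tells Kùzu which precision and scale to use, rather than leaving it to be inferred from a Python float.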