Is your feature request related to a problem?
I have a builtin UDF that only works in DuckDB (or any backend with a builtin function called damerau_levenshtein).
I have some library function like def address_similarity(a1: ir.StringValue, a2: ir.StringValue) -> ir.FloatingValue. Internally it wants to use Damerau-Levenshtein string edit distance to calculate the score. But when a user hands me an abstract expression, I don't know which backend they are hoping to execute it on. If they are going to execute it on DuckDB, then using the builtin UDF would work fine. But if they are going to execute it on a different backend, then I would want to fall back to some Python/pyarrow UDF. But I don't know which to do at expression creation time!
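For the fallback side, a pure-Python function is enough to illustrate the kind of thing I would wrap in a Python UDF. This is a minimal sketch, not ibis code: `osa_distance` is a name made up here, and it computes the optimal string alignment (restricted Damerau-Levenshtein) variant, which may not agree with a backend's builtin on every input.

```python
def osa_distance(s1: str, s2: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Hypothetical fallback for illustration only.
    """
    n, m = len(s1), len(s2)
    # d[i][j] holds the distance between s1[:i] and s2[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i characters
    for j in range(m + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if (
                i > 1
                and j > 1
                and s1[i - 1] == s2[j - 2]
                and s1[i - 2] == s2[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]
```

The point is that this runs anywhere Python runs, but is much slower than a builtin, which is why I only want it when the backend has nothing better.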
Describe the solution you'd like
spitballing here:
# Other args like name, database, etc. aren't allowed here. This just creates the contract on the ibis side.
@ibis.udf.scalar(signature=...)
def damerau_levenshtein(left: str, right: str) -> int: ...

# Now we plug in implementations...
@damerau_levenshtein.builtin(backends=["duckdb", ...], name="damerau_levenshtein", database=...)
def _damerau_levenshtein_duckdb(): ...

# backends=None means use this as the fallback
@damerau_levenshtein.python(backends=None, database=...)
def _damerau_levenshtein_udf(s1: str, s2: str) -> int:
    return somelib.damlev(s1, s2)

def address_similarity(a1, a2):
    return damerau_levenshtein(a1, a2)
The old APIs should keep working as they did; I don't think they need to change.
What version of ibis are you running?
main
What backend(s) are you using, if any?
No response
Code of Conduct
[X] I agree to follow this project's Code of Conduct