ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
5.22k stars 592 forks source link

bug: re_replace() doesn't fill params right? #5522

Closed NickCrews closed 1 year ago

NickCrews commented 1 year ago

What happened?

Thanks for looking at this.

import ibis
# this errors:
ibis.literal("hi").re_replace("x", "y").substitute({"d": "b"}, else_="k")
# This works fine:
# ibis.literal("hi").re_replace("x", "y")
# this works fine:
# ibis.literal("hi").substitute({"d": "b"}, else_="k")

It generates:

SELECT CASE regexp_replace(:param_1, :param_2, :param_3, :regexp_replace_1) WHEN :param_4 THEN :param_5 ELSE :param_6 END AS "SimpleCase(RegexReplace('hi', 'x', 'y'), 'k')"

which results in

Error: Invalid Input Error: Unrecognized Regex option d
[SQL: SELECT CASE regexp_replace(?, ?, ?, ?) WHEN ? THEN ? ELSE ? END AS "SimpleCase(RegexReplace('hi', 'x', 'y'), 'k')"]
[parameters: ('hi', 'x', 'y', 'g', 'd', 'b', 'k')]

So something about how ibis doesn't correctly fill in :regexp_replace_1 with the constant "g" flag as it should, but the value I passed in, "d" from the substitute(), is used instead.

Lemme know if you think of a workaround, I could write my own extension function for re_replace() until this is fixed, but maybe there's something easier.

What version of ibis are you using?

4.1.0

What backend(s) are you using, if any?

duckdb

Relevant log output

No response

Code of Conduct

cpcloud commented 1 year ago

This is an unfortunate artifact of using the numeric style for bind parameters and perhaps a bug in duckdb-engine (though that would be surprising).

In any event we can fix this by using sa.text. Fix incoming.