karlseguin / pg.zig

Native PostgreSQL driver / client for Zig
MIT License
148 stars 8 forks source link

Bind parameters causing invalid byte sequence error #27

Closed BTOdell closed 22 hours ago

BTOdell commented 5 days ago

When running the following query with a single i32 bind parameter,

SELECT pg_notify('channel', CAST($1 AS text));

I get this error:

invalid byte sequence for encoding "UTF8": 0x00

If I manually replace the bind parameter with a value,

SELECT pg_notify('channel', CAST(1 AS text));

the query executes successfully.

Also, if I serialize the i32 to a []u8 beforehand and pass the string to the bind parameter instead and replace the CAST($1 AS text) with just $1, like:

SELECT pg_notify('channel', $1);

then the query also executes successfully.

It seems that I've isolated the issue down to a potential bug in the binding logic in this library. I'm using the rowOpts function to execute the query as I only expect a single row back.

karlseguin commented 4 days ago

I'm on the fence as to whether or not this is a bug.

You're passing an integer into a text parameter. Should the library convert it? In this case (int -> text), it would be simple enough. But I'm afraid of the non trivial cases, like float -> text, or worse.

Using select pg_notify('channel', cast($1::int as text)) solves the issue.

Honestly open to feedback.

(Could have better error handling for this)

karlseguin commented 4 days ago

The readme touches on this a little, but isn't explicit about what it will and won't convert:

"...when binding a value to an SQL parameter, the library is a little more generous. For example, an u64 will bind to an i32 provided the value is within range.

This is particularly relevant for types which are expressed as []u8. For example a UUID can be a raw binary [16]u8 or a hex-encoded [36]u8. Where possible (e.g. UUID, MacAddr, MacAddr8), the library will support binding either the raw binary data or text-representation. When reading, the raw binary value is always returned."

karlseguin commented 4 days ago

If we do decide to change the behavior, either to convert or to at least improve the error messaging, this is a simple repo test:

test "PG: bind conversion" {
    var c = t.connect(.{});
    defer c.deinit();

    var row = c.row("select $1 as text", .{@as(i32, 1)}) catch |err| {
        std.debug.print("{s}\n", .{c.err.?.message});
        return err;
    };
    defer row.?.deinit() catch {};
}
BTOdell commented 4 days ago

I find this really bizarre how select pg_notify('channel', cast($1::int as text)) solves the issue. The :: is the cast operator which I would expect to do the same thing as CAST(...). Does pg.zig do anything special when it sees the :: operator?

In my mind, this is casting the bind parameter to an int first and then (back?) to text? What was the type of the bind parameter to begin with?

karlseguin commented 4 days ago

No. pg.zig never parses the SQL. This is all on the PostgreSQL side. You can see it for yourself in psql

> prepare test as select $1 as text;
> select parameter_types from pg_prepared_statements where name = 'test';
 parameter_types
─────────────────
 {text}
> deallocate test;

> prepare test as select $1::int as text;
> select parameter_types from pg_prepared_statements where name = 'test';
 parameter_types
─────────────────
 {integer}
> deallocate test;

If you just do select $1 you'll see that the default parameter type is text. So the difference is really between select $1 and select $1::int. The cast doesn't inform the parameter types. So even select $1 as int will show $1 as a text parameters.

So :: and cast are very different. One is a type definition on the parameter (which PostgreSQL is usually good at inferring) and the other is some thing for the SQL engine to deal with.

BTOdell commented 22 hours ago

Going to close this as I agree with you now that it's not a bug. Thank you for providing a solution to my problem and for your patience :)