launchbadge / sqlx

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, and SQLite.
Apache License 2.0
12.81k stars 1.22k forks source link

error occurred while decoding column 43: unexpected null; try decoding as an `Option` - But struct field *IS* already Option<string> #3336

Open reasv opened 1 month ago

reasv commented 1 month ago

Bug Description

I'm running into what I believe must be a bug in SQLx.

I get this error when using query_as! with my struct:

error occurred while decoding column 43: unexpected null; try decoding as an `Option`

In order to identify which column was causing the problem, I replaced a bunch of columns in the select statement with static values, ensuring they would not be null, like this

SELECT
table.column as column_name

replaced with this

SELECT
'value' as column_name

Which worked. I identified the column, which indeed is NULL as a result of a LEFT JOIN in the query (there are several chained LEFT JOIN).

However, the field in the struct is already Option<String>, so SQLx should have no problem populating it with None.

This is a bug and clearly not intended behaviour. I have no idea what kind of edge case triggered this issue, but the query contains several successive LEFT JOIN clauses where columns from the previous LEFT JOIN are used in the ON clause for the next LEFT JOIN, fwiw.

Important Note

The error does not happen if I do this:

SELECT
NULL as column_name

with the problematic column. So, despite SQLx claiming that the unexpected null in that field is a problem, it only seems to happen when the null comes as a result of the JOIN. If the NULL is a constant value, it has no issue populating the struct field with None.

Minimal Reproduction

https://github.com/reasv/mitsuba/blob/master/src/db.rs#L667C1-L668C1 The columns causing the issue are all three of file_sha256, mitsuba_file_hidden and thumbnail_sha256

Info

spencerbart commented 1 month ago

I've run into this problem many times. You have to use the function instead of the macro. The sqlx prepare compiler is not generating the query plan correctly.

https://docs.rs/sqlx/latest/sqlx/fn.query_as.html

abonander commented 1 month ago

@reasv this can happen if the database server returns columns in a different order than the server it was compiled with. I see your SELECT starts with posts.*, I recommend listing those columns explicitly instead.

reasv commented 1 month ago

I recommend listing those columns explicitly instead.

That's highly impractical given there's 40 columns and I need to make a similar query in several different places - which would force me to copy and paste it everywhere. I tried to concatenate strings using compile time macros in order to avoid repetition, but that doesn't seem possible with the query_as! macro.

ORMx has some kind of different macro, but last time I checked it was still on SQLx 0.6.x

this can happen if the database server returns columns in a different order than the server it was compiled with.

I don't think this is exactly what's happening because the error also happens on my development machine when compiling and running using cargo run, the server in question being available and connected - there is no other server it could have used as a reference.

You have to use the function instead of the macro.

I'd rather avoid losing type safety as it has been fundamental to easily refactoring the code in the past after schema changes.

I will use a different workaround:

SELECT
posts.*,
CASE
    WHEN
        files.sha256 IS NOT NULL
    THEN files.sha256
    ELSE NULL
END AS file_sha256,
CASE
    WHEN
        thumbnails.hidden IS NOT NULL
    THEN thumbnails.hidden
    ELSE NULL
END AS mitsuba_file_hidden,
CASE
    WHEN
        thumbnails.sha256 IS NOT NULL
    THEN thumbnails.sha256
    ELSE NULL
END AS thumbnail_sha256
FROM posts

I just wish I could avoid copying and pasting it.

abonander commented 1 month ago

It's not really documented, but the query macros will accept a query string in fragments separated by +. This is specifically to support the generation of query strings by macros.

concat!() doesn't work because macros are expanded from the outside-in, though we could always add a special case for it. Alternatively, we could utilize an attribute proc macro as arguments in attributes are eagerly expanded, that's how you can do #[doc = include_str!("foo.md")] which we've utilized in a few places. One of a thousand big refactors I've been meaning to do when I can find the time.

Can you post the output of cargo expand for the query_as!() invocation? Also, the output of EXPLAIN (VERBOSE, FORMAT JSON) <query>.

Thinking about it, it's distinctly possible that the macros are incorrectly inferring that the columns are not nullable. They effectively emit code like field: row.try_get::<InferredType>()?.into() and that .into() could be hiding faulty null inference. The nullability inference errs on the side of false positives to try to avoid this exact situation, but it's far from perfect.

You could test for this by using a null override: CASE ... END AS "file_sha256?", etc.

reasv commented 1 month ago

explain_plan_1721209993422 JSON: https://gist.github.com/reasv/62df861d2bd96c74e69f1906e5e1f082 SQL: https://gist.github.com/reasv/23f6b872800585126e1fe7596d456a22

You could test for this by using a null override: CASE ... END AS "file_sha256?", etc.

Given that the current workaround

CASE
    WHEN
        files.sha256 IS NOT NULL
    THEN files.sha256
    ELSE NULL
END AS file_sha256

works, I would assume this should work also, but without the need for the case statement. I'll try it. I didn't know it was possible.

It's not really documented, but the query macros will accept a query string in fragments separated by +. This is specifically to support the generation of query strings by macros.

Not entirely sure how to use this given that, as you say, macros are expanded from the outside in. It also doesn't seem to work: image

But I am probably misunderstanding your suggestion.

reasv commented 1 month ago

I would assume this should work also, but without the need for the case statement. I'll try it. I didn't know it was possible.

SELECT
posts.*,
files.sha256 AS \"file_sha256?\",
thumbnails.hidden AS \"mitsuba_file_hidden?\",
thumbnails.sha256 AS \"thumbnail_sha256?\",

appears to solve the issue and is not nearly as verbose as my workaround using CASE statements. This is a good enough solution for me, especially if I can figure out how to avoid repeating the same query fragment using + as you suggested. As I mentioned in my previous post, it doesn't seem to work, at least not in the way I expected.