launchbadge / sqlx

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, and SQLite.
Apache License 2.0
12.98k stars 1.24k forks source link

Potential vulnerability: overflowing and truncating casts #3440

Closed abonander closed 3 weeks ago

abonander commented 4 weeks ago

Context

User "Sytten" on Discord brought to our attention the following presentation from this year's DEFCON: https://media.defcon.org/DEF%20CON%2032/DEF%20CON%2032%20presentations/DEF%20CON%2032%20-%20Paul%20Gerste%20-%20SQL%20Injection%20Isn't%20Dead%20Smuggling%20Queries%20at%20the%20Protocol%20Level.pdf

(Note: if there's a public page or blog post for this, or one is posted in the future, please let me know so I can include it for posterity.)

Essentially, encoding a value larger than 4GiB can cause the length prefix in the protocol to overflow, causing the server to interpret the rest of the string as binary protocol commands (or other data? it's not clear):

image image

It appears SQLx does perform truncating casts in a way that could be problematic, for example: https://github.com/launchbadge/sqlx/blob/6f2905695b9606b5f51b40ce10af63ac9e696bb8/sqlx-postgres/src/arguments.rs#L163

This code has existed essentially since the beginning, so it is reasonable to assume that all published versions are affected.

It's hard to glean just from the slides exactly how this may be exploited, and in any case it requires a malicious input at least 4 GiB long.

Mitigation

As always, you should make sure your application is validating untrustworthy user input. Reject any input over 4 GiB, or any input that could encode to a string longer than 4 GiB. Dynamically built queries are also potentially problematic if it pushes the message size over this 4 GiB bound.

Encode::size_hint() can be used for sanity checks, but do not assume that the size returned is accurate. For example, the Json<T> and Text<T> adapters have no reasonable way to predict or estimate the final encoded size, so they just return size_of::<T>() instead.

For web application backends, consider adding some middleware that limits the size of request bodies by default.

Resolution

I have started work on a branch that adds #[deny] directives for the following Clippy lints:

and I'm auditing the code that they flag. This is the same approach being used by Diesel: https://github.com/diesel-rs/diesel/pull/4170

In the process I realized that our CI wasn't running a lot of our unit tests, so I'm fixing that as well.

After the fix, attempting to encode a value larger than is allowed in a given binary protocol will return an error instead of silently truncating it.

I will also be filing a RUSTSEC advisory.

abonander commented 4 weeks ago

@lovasoa you have issues disabled on sqlx-oldapi so I'm pinging you here to make sure you're aware of this.

I'll leave it up to you whether to file your own RUSTSEC advisory.

lovasoa commented 4 weeks ago

Thank you very much @abonander ! I just activated issues in sqlx-oldapi, and will look into this.

Sytten commented 4 weeks ago

I will link the talk when it is available. Talk summary: https://defcon.org/html/defcon-32/dc-32-speakers.html#54466

caizixian commented 4 weeks ago

(Note: if there's a public page or blog post for this, or one is posted in the future, please let me know so I can include it for posterity.)

There's a version available on the wayback machine https://web.archive.org/web/20240814175011/https://media.defcon.org/DEF%20CON%2032/DEF%20CON%2032%20presentations/DEF%20CON%2032%20-%20Paul%20Gerste%20-%20SQL%20Injection%20Isn't%20Dead%20Smuggling%20Queries%20at%20the%20Protocol%20Level.pdf

touilleMan commented 3 weeks ago

Hi,

tl;dr: Is this attack PostgreSQL-only or do you think MySQL/SQLite could be also affected ?

From the screenshot, this attack seems to only impact the PostgreSQL backend. However it is likely the other backends are written in a similar manner and hence might also be vulnerable (though in different ways since the binary protocol is obviously different).

I would say SQLite backend is most likely not affected since it doesn't use a binary protocol as far as I am aware.

abonander commented 3 weeks ago

The MySQL driver has some suspect casts but the actual packet length encoding is sound from what I've seen.

However, because the MySQL protocol is more contextual than the Postgres protocol, there may be other length-encoded values that can overflow and cause misinterpretation.

I haven't gotten to the COM_STMT_EXECUTE encoding yet, that's the most likely culprit.

I don't have a solid answer for SQLite yet. Yes, it doesn't have a binary protocol, but we do have to pass lengths for things like blobs when we bind them and those may have incorrect casts since C APIs tend to prefer signed integers.

abonander commented 3 weeks ago

Update: the MySQL driver appears to be mostly fine, we already had strong checks on the lengths of encoded packets thanks to the support for packet splitting. The rest of the audit there was just adding sanity checks.

The SQLite driver had one cast that concerned me when we're about to call sqlite3_prepare_v3(): https://github.com/launchbadge/sqlx/pull/3441/files#diff-6a4648966ef9596ce442425261911d80883a3fc1132b2ac37c7c72140669f859L166

If query_len overflows, that could turn into SQL injection if the split in the query string caused by the overflow produces two substrings that are both valid SQL. However, if an attacker has sufficient control over the SQL your application executes to pull off this exploit, you likely already have a normal SQL injection vulnerability anyway.

The SQLite3 API indeed loves to sprinkle c_ints everywhere that we often want to cast to usize. I chose to turn overflows into panics there because signed integer overflow is UB in the C standard and thus the SQLite3 API should never be returning negative values unless otherwise specified. The other edge case is overflowing usize on 16-bit platforms but I don't really care about that at this time.

abonander commented 3 weeks ago

Update deux: CI on #3441 is passing, I'd like to at least try to reproduce the exploits on the base commit to have a good regression test.

abonander commented 3 weeks ago

I have created a regression test based on main that demonstrates the exploit with Postgres: https://github.com/launchbadge/sqlx/commit/7dbacb6454de4d89425ab4b6bf50a83bf0d4b9c9

/home/austin/.cargo/bin/cargo test --color=always --package sqlx --test postgres-rustsec rustsec_2024_0363 --features postgres,runtime-tokio --no-fail-fast --config env.RUSTC_BOOTSTRAP=\"1\" -- --format=json --exact -Z unstable-options --show-output
Testing started at 5:33 PM ...
    Blocking waiting for file lock on build directory
    Finished `test` profile [unoptimized + debuginfo] target(s) in 7.31s
     Running tests/postgres/rustsec.rs (target/debug/deps/postgres_rustsec-5b5126b7e67d89a0)

assertion `left == right` failed
  left: ["you've been pwned!", "fake_msg"]
 right: ["existing message", "fake_msg"]

Left:  ["you've been pwned!", "fake_msg"]
Right: ["existing message", "fake_msg"]
abonander commented 3 weeks ago

I'm actually oddly proud of this because I improved upon the slides by figuring out a more versatile method of padding the payload in a way that won't break the connection, at least from a protocol perspective.

As it turns out, the Postgres backend will read and discard Flush messages without asserting that they're empty (it just limits them to PQ_SMALL_MESSAGE_LIMIT, 10,000 bytes): https://github.com/postgres/postgres/blob/ff59d5d2cff32cfe88131f87b6c401970d449c08/src/backend/tcop/postgres.c#L435

This means that, not only can we use them to pad the payload, but we can also use a final Flush message to eat the remainder of the original message, without causing any additional responses that might break the connection.

However, ironically enough, the connection will still break because the injected Query message with our attack payload will cause an extra ReadyForQuery response that our PgConnection isn't expecting, causing this decrement to underflow: https://github.com/launchbadge/sqlx/blob/6f2905695b9606b5f51b40ce10af63ac9e696bb8/sqlx-postgres/src/connection/mod.rs#L106

Which will then hang in this loop as it's going to expect 232 - 1 more ReadyForQuery messages to be sent: https://github.com/launchbadge/sqlx/blob/6f2905695b9606b5f51b40ce10af63ac9e696bb8/sqlx-postgres/src/connection/mod.rs#L82-L88

abonander commented 3 weeks ago

As it turns out, what I thought might be a similar vulnerability in the SQLite driver turned out to be a non-issue because I had missed this check: https://github.com/launchbadge/sqlx/blob/6f2905695b9606b5f51b40ce10af63ac9e696bb8/sqlx-sqlite/src/statement/virtual.rs#L59-L64

Curiously, it appears the clippy::cast_sign_loss lint ignored this one. It must have a special-case for i32::MAX. Not sure why clippy::cast_possible_truncation didn't trigger though, since it lints for cases where usize is 16 bits (addendum: I guess I was misremembering this).

abonander commented 3 weeks ago

As expected, MySQL doesn't appear to be exploitable in this fashion. It actually has pretty tight limits on packet sizes by default and is hardcoded not to accept any packet larger than 1 GiB: https://dev.mysql.com/doc/refman/8.4/en/packet-too-large.html

abonander commented 3 weeks ago

Final update: 0.8.1 has been released with #3441: https://github.com/launchbadge/sqlx/blob/main/CHANGELOG.md#081---2024-08-23

Sytten commented 3 weeks ago

Thanks for the reactive fix! I reported the problem on Discord and I also forwarded that to sea-orm which is based on sqlx and used in many businesses.

marcospb19-cw commented 1 week ago

For those who are facing the breaking changes of 0.8.0 when updating version, here is a reference PR on how to fix them:

https://github.com/cloudwalk/stratus/pull/1673/files

Both encode returning Result and the GATs change are detailed in the CHANGELOG.

Posting here just in case I save someones time :).