tembo-io / pgmq

A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
PostgreSQL License
2.68k stars 71 forks source link

SQL Injection #295

Closed olirice closed 3 months ago

olirice commented 3 months ago

Some of the functions exposed in pgmq are susceptible to SQL injection

Here's a reproducible example:

create extension pgmq;

select pgmq.create('abc');

select
   pgmq.delete(
     'abc where false;
     create table public.attack_vector(id int); -- Any SQL can be placed here
     delete from pgmq.q_abc',
     1
  );

select * from public.attack_vector; -- Demonstrate that table was created

and see that the table public.attack_vector was created.

It looks like several other functions doing SQL interpolation with %s are susceptible both in the SQL and over client libraries

ChuckHend commented 3 months ago

Thanks for raising the issue @olirice. Does changing the %s to %I resolves this? Will have a PR up soon.

olirice commented 3 months ago

It solves this problem but introduces backwards incompatibility issues

For example

-- No change
select format('delete from pgmq.q_%s;', 'foo') -- delete from pgmq.q_foo
select format('delete from pgmq.q_%I;', 'foo') -- delete from pgmq.q_foo

-- Output changed
select format('delete from pgmq.q_%s;', 'FOO') -- delete from pgmq.q_FOO;
--  ^ which actually gets interpreted as delete from pgmq.q_foo

select format('delete from pgmq.q_%I;', 'FOO') -- delete from pgmq.q_"FOO";  -- which is invalid

so it would break anyone with special characters in a table name.

olirice commented 3 months ago

I think to maintain backwards compatibility and plug the issue you'd want a function like

create function sanitize_table_name(name)
  returns text
  immutable
  language sql
as $$
  select format(pgmq.'%I', lower('q_' || name))
$$;

select sanitize_table_name('foo'); -- pgmq.q_foo
select sanitize_table_name('f$o'); -- pgmq."q_f$oo"
select sanitize_table_name('Foo'); -- pgmq.q_foo
select sanitize_table_name('Fo$o'); -- pgmq."q_fo$o"

which matches the legitimate use cases & continues to downcase any capital letters

ChuckHend commented 3 months ago

Do we need the lower()? Differentiating with upper and lower case are valid queue and table names.

format(pgmq.%I .... is how we're addressing it over in https://github.com/tembo-io/pgmq/pull/296 if you can take a peek.

olirice commented 3 months ago

Do we need the lower()

for backwards compatibility you would. If someone running v1.4.0 executes

select pgmq.create('Bar')

that will execute

CREATE TABLE IF NOT EXISTS pgmq.q_%s (

resulting in a table named pgmq.q_bar

If they then upgrade to 1.4.1 which includes a security fix replacing the table naming logic to

format('pgmq.%I  ... , 'q_' || $1) 

then any function calls they make using 'Bar' as the queue name will become pgmq."q_Bar" and they'll get a

ERROR:  table pgmq."Bar" does not exist 
olirice commented 3 months ago

(imo) the best way to make sure you get it correct would be to

  1. add some test cases with weird queue names on the current version 1.4.0
  2. Add a select table_name from information_schema.tables where table_schema = 'pgmq' to list out the table names
  3. Create a new version that solves the security issue
  4. re-run the test suite and make sure there's no pg_regress diff
ChuckHend commented 3 months ago

I see now, thank you. The intention has been to allow queue names to be any valid Postgres table name, so it kind of feels like a bug that we've been indirectly casting everything to lower. I'll take a look and see if that was introduced with the pgrx-ectomy or if its always been the case. I agree this would introduce some issues.

ChuckHend commented 3 months ago

Ok, so while it was the intention to have flexibility on the queue names, even the pgrx implementation resulted in lower cased table names

pgmq=# \dx
                                     List of installed extensions
  Name   | Version |   Schema   |                             Description                             
---------+---------+------------+---------------------------------------------------------------------
 pgmq    | 1.2.0   | pgmq       | A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

pgmq=# select pgmq.create('Bar');
 create 
--------

(1 row)

pgmq=# select pgmq.
pgmq=# \dt pgmq.*
          List of relations
 Schema | Name  | Type  |   Owner    
--------+-------+-------+------------
 pgmq   | a_bar | table | adamhendel
 pgmq   | meta  | table | adamhendel
 pgmq   | q_bar | table | adamhendel
(3 rows)
ChuckHend commented 3 months ago

@olirice we'll maintain backward compatibility, force all queue names to lower case like it has always behaved, and figure out if/how we want to support cased queue names another time. Thanks for the tip.

ChuckHend commented 3 months ago

Hi @olirice, https://github.com/tembo-io/pgmq/pull/296 is in the main branch now and we release 1.4.1 this morning. Thanks again for raising the issue. Is there anything else, or can we close the issue?

olirice commented 3 months ago

I'll review 1.4.1 now and see if anything is exploitable. Will follow up in a few hours

ChuckHend commented 3 months ago

I'll review 1.4.1 now and see if anything is exploitable

Thank you. We'll patch immediately if you find anything.

olirice commented 3 months ago

These format calls can be removed or need to be '%s' to avoid double escaping the table name https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L472 https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L476 https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L604 https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L609


This will fail if you have special characters in the queue name because queue_name Foo formats to "Foo" so when concatenated with _vt_idx you'll get "Foo"_vt_idx, which is not valid https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L484 https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L689


Unused variable https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L712C1-L712C27


If the reference to those is inside e.g. create_nonpartitioned it is probably also present in create_partitioned etc

If you exercise these functions using a queue name with special characters in tests that'd go a long way to identifying any issues.

ChuckHend commented 3 months ago

This will fail if you have special characters in the queue name because queue_name Foo formats to "Foo" so when concatenated with _vt_idx you'll get "Foo"_vt_idx, which is not valid

I think we can probably resolve these along with https://github.com/tembo-io/pgmq/issues/293 at same time.

ChuckHend commented 3 months ago

This will fail if you have special characters in the queue name because queue_name Foo formats to "Foo" so when concatenated with _vt_idx you'll get "Foo"_vt_idx, which is not valid

Trying to reproduce. We maintained backwards compatibility though so the queue names are cast to lower case.

\dx pgmq
                                 List of installed extensions
 Name | Version | Schema |                             Description                             
------+---------+--------+---------------------------------------------------------------------
 pgmq | 1.4.1   | pgmq   | A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.

select pgmq.create('Foo');

postgres=# \d pgmq.q_foo
                                      Table "pgmq.q_foo"
   Column    |           Type           | Collation | Nullable |           Default            
-------------+--------------------------+-----------+----------+------------------------------
 msg_id      | bigint                   |           | not null | generated always as identity
 read_ct     | integer                  |           | not null | 0
 enqueued_at | timestamp with time zone |           | not null | now()
 vt          | timestamp with time zone |           | not null | 
 message     | jsonb                    |           |          | 
Indexes:
    "q_foo_pkey" PRIMARY KEY, btree (msg_id)
    "q_foo_vt_idx" btree (vt)
theory commented 3 months ago

These format calls can be removed or need to be '%s' to avoid double escaping the table name

https://github.com/tembo-io/pgmq/blob/78a766bd750b5985ac590ac384b8d294601fe544/pgmq-extension/sql/pgmq--1.4.0--1.4.1.sql#L472

format_table_name() does not quote identifiers, just lowercase the whole thing. But we definitely don't need to format identifiers in function calls; presumably the functions do it. If that's true, then no need for format() at all:

IF NOT pgmq._belongs_to_pgmq(qtable) THEN
ChuckHend commented 3 months ago

Draft up to address the remaining stuff here + #293

ChuckHend commented 3 months ago

@olirice - if we've resolve the SQL injection concerns then I'd like to close this issue.

If there's any other issues or suggestions, can we create new issues to track them?

olirice commented 3 months ago

You'd have a hard time exploiting them currently but if those functions get refactored they could become dangerous

https://github.com/tembo-io/pgmq/blob/0ae3ba5aac25adda2936a3aac7b1e8230b55e70f/pgmq-extension/sql/pgmq.sql#L491 https://github.com/tembo-io/pgmq/blob/0ae3ba5aac25adda2936a3aac7b1e8230b55e70f/pgmq-extension/sql/pgmq.sql#L500

generally any '%s' with a string input that hasn't gone through a '%I' is dangerous

If it were me I'd feel more comfortable with some minimal sanitizing checks in pgmq.format_table_name to raise exceptions any of the following are used ;, ', $, or -- there's not much reason for any of those sequences to be in a queue name, but I recognize that's an opinionated stance

ChuckHend commented 3 months ago

Generally I'd like to allow any valid Postgres table name as a queue name, but we're already casting to lower case and seems reasonable to reject those to make the project overall more secure. That does break backwards compatibility though since we do allow those characters today.

ChuckHend commented 3 months ago

@olirice - this auto-closed when we merged #300, sorry about that. Please re-open if there is still a vulnerability. We're now raising an exception when queue names contain ;, ', $, or --. Also, we removed %s everywhere except for the two RAISE NOTICE.

olirice commented 2 months ago

lgtm! thanks for the quick turn