cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.03k stars 3.8k forks source link

sql: incorrect typing of input expressions of operators #132268

Open mgartner opened 1 week ago

mgartner commented 1 week ago

Consider the example:

DROP TABLE IF EXISTS t;

CREATE TABLE t (
  id INT PRIMARY KEY,
  c CHAR(20)
);

INSERT INTO t VALUES (1, 'foo');

-- In Postgres, these all return a single row.
-- In CRDB, the second query returns no rows.
SELECT * FROM t WHERE c = 'foo';
SELECT * FROM t WHERE c = 'foo        ';
SELECT * FROM t WHERE c = 'foo        '::CHAR(20);

I believe CRDB is not matching PG's behavior here because our type-checking logic does not match PG's. In SELECT * FROM t WHERE c = 'foo ', we type the 'foo ' literal as a STRING not a CHAR(20). Therefore, the trailing whitespace is consider significant, and the filter incorrectly excludes the row in the table. When we force the 'foo ' literal to be a CHAR(20) in the last query, the row is correctly returned.

Postgres's docs highlight the type conversion logic that should be followed in this case:

  1. Check for an operator accepting exactly the input argument types. If one exists (there can be only one exact match in the set of operators considered), use it. Lack of an exact match creates a security hazard when calling, via qualified name [9] (not typical), any operator found in a schema that permits untrusted users to create objects. In such situations, cast arguments to force an exact match.

    a. If one argument of a binary operator invocation is of the unknown type, then assume it is the same type as the other argument for this check. Invocations involving two unknown inputs, or a prefix operator with an unknown input, will never find a match at this step.

A key thing to call out here is that the string literal should be initially typed as UNKNOWN in order for this conversion rule to apply. We don't currently do that, so #94718 might be a prerequisite to fixing this, unless we can find a clever workaround.

There is a related meta issue for better conformity to PG's type checking and conversion: https://github.com/cockroachdb/cockroach/issues/75101

Workaround

The only workaround I'm aware of is to explicitly cast the literal to a CHAR(N) or BPCHAR type (see the third query in the example above). The latter should work for comparisons with CHAR(N) types for all N, without having to specify N.

Jira issue: CRDB-42907

blathers-crl[bot] commented 1 week ago

Hi @mgartner, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.