diesel-rs / diesel

A safe, extensible ORM and Query Builder for Rust
https://diesel.rs
Apache License 2.0
12.7k stars 1.07k forks source link

Our type system can panic when comparing nullable values #104

Closed sgrif closed 8 years ago

sgrif commented 8 years ago

I should note that this is relatively hard to actually demonstrate in normal code. The issue here is that x = NULL in SQL returns NULL. I've managed to distill it into the following test case:

#[test]
fn test_comparing_null() {
    use diesel::prelude::*;
    use diesel::select;
    use diesel::expression::AsExpression;
    use diesel::types::{Nullable, Integer};

    let query = select(AsExpression::<Nullable<Integer>>::as_expression(None::<i32>).eq(Some(1)));
    query.get_result::<bool>(&connection()).unwrap();
}

The query here is ultimately

SELECT NULL = 1;

The problem is that the sql type of Eq<T, U> is always Bool, and so we panic on an "unexpected null value" when trying to read it. We need to have the type of Eq<Nullable<T>, Nullable<U>> to be Nullable<Bool>.

I should also note that this only applies to these sorts of expressions appearing in a select clause. It does not affect us in places like where clauses. Also as an aside, in places like where clauses, NULL is effectively the same as false which is part of why I'm leaning towards the second answer.

There are basically two answers here:

sgrif commented 8 years ago

I've raised questions on the specialization RFC related to this. I think the real answer under specialization might involve adding a NotNull trait and IsNullable trait (redundant with Nullable<T>, maybe potentially replacing it with T: NativeSqlType + IsNullable), and having the following:

impl<T, U> Expression for Eq<T, U> where
    T: Expression,
    U: Expression<SqlType=T::SqlType>,
    T::SqlType: NotNull,
{
    type SqlType = Bool;

    fn to_sql(...) { ... }
}

impl<T, U> Expression for Eq<T, U> where
    T: Expression,
    U: Expression<SqlType=T::SqlType>,
    T::SqlType: IsNullable,
{
    type SqlType = Nullable<Bool>;

    fn to_sql(...) { // identical to previous }
}

impl<T, U> Expression for Eq<T, U> where
    T: Expression,
    U: Expression<SqlType=T::SqlType>,
    T::SqlType: IsNullable + NotNull,
{
    type SqlType = ();

    fn to_sql(...) { unreachable!("No type should impl NotNull and IsNullable"); }
}
sgrif commented 8 years ago

As I think through this, I'm becoming more confident that this problem is scoped entirely too booleans coming from infix predicates. While technically you can run into the same problem with SELECT 1 + NULL, we don't actually implement Add for Nullable<Anything>. So another way to put this is that the problem only applies in cases where we have an unbounded constraint on the input types. At first glance, that appears to only apply to infix predicates which return a boolean (and IsNull/IsNotNull, but this doesn't apply to those for obvious reasons).

Given the idea of "truthy"/"falsey" in other languages, I think that treating NULL as false is a reasonably comfortable conversion. This would not remove the idea of a Nullable<Bool>, which you could certainly still have as a column. It would only affect cases that we at present cannot correctly handles, which is nullable_expression.eq(other_nullable_expression).

As a side note, if we ignore other backends, this problem is not solved by just mapping Eq to IS DISTINCT FROM in PG. IS DISTINCT FROM is not treated as an operator, so we cannot do things like x IS DISTINCT FROM ANY(some_array). The problem also applies to cases like > and <.

I'm starting to become amicable to this over a solution which changes the type to Nullable<Bool>, as unless we suddenly decide to allow T to be comparable to Nullable<T>, which we currently prevent by design, any case involving NULL would bubble up through AND in painful ways. For example x.eq(y).and(nullable_x.eq(nullable_y)) would not compile, as both sides would need to be Nullable.

Other cases where this does apply in SQL are essentially functions, and operators like +, which do have bounded input. We don't currently have elegant handling of multiple types for functions, but for operators like +, it's completely possible with our types as set up today to have Nullable<Integer> + Nullable<Integer> return Nullable<Integer>. What is not possible today is to have Integer + Nullable<Integer> (looking through the commit history, I'm not sure I fully understand why I wrote Rhs as an associated type, and not a type parameter. Presumably it's due to wanting to use AsExpression, and the inability to say impl<Rhs> ::std::ops::Add<Rhs> where { there is exactly one type for which "AsExpression" is implemented on Rhs, and ::diesel::ops::Add exists } but I need to confirm.

sgrif commented 8 years ago

Interesting point from IRC: We could actually handle Eq similarly to Add, where we have a EqResult trait of some kind with a known output. This would require adding brute force impls for every native type, but we can remove a lot of that pain with macros, similarly to how we handle Add for numeric types, and Queriable/AsExpression for primitive types. This would also enable us to add equality comparisons that we don't support today like varchar.eq(text). We'd still have the same x.eq(y).and(nullable_x.eq(nullable_y)), but we might be able to figure out some way to implement impl<T: Expression<SqlType=Bool>> AsExpression<Nullable<Bool>> for T that doesn't overlap with our existing impl<T: Expression> AsExpression<T::SqlType> for T

sgrif commented 8 years ago

I've decided to pull the trigger on treating NULL as false in this particular case during deserialization. I'm reasonably sure that this can only manifest itself with booleans, and I'm comfortable with that coercion for that type in particular as it essentially mimics it's behavior in boolean contexts in SQL outside of a select clause.

We should continue to keep an eye on this and make sure it cannot manifest itself for any types other than bool.

I believe we might be able to fix this with specialization by specializing SelectableExpression rather than Expression. Something like this:

impl<T, U, QS, ST> SelectableExpression<QS, Bool> for Eq<T, U> where
    ST: NotNull,
    T: SelectableExpression<QS, ST>,
    U: SelectableExpression<QS, ST>,
{
}
impl<T, U, QS, ST> SelectableExpression<QS, Nullable<Bool>> for Eq<T, U> where
    ST: NotNull,
    T: SelectableExpression<QS, Nullable<ST>>,
    U: SelectableExpression<QS, Nullable<ST>>,
{
}