slick / slick

Slick (Scala Language Integrated Connection Kit) is a modern database query and access library for Scala
https://scala-slick.org/
BSD 2-Clause "Simplified" License
2.65k stars 608 forks source link

Simplify comparing Option values #2249

Open andyczerwonka opened 3 years ago

andyczerwonka commented 3 years ago

I have a use case where I have an optional column, and I want to write a filter that checks for a value when it's present or check for null it it's None.

I initially wrote:

table.filter(_.name === name)

In this case, when None, the query generates where name = null, which is not correct, I was expecting where name is null, which feels like a bug. I had to change my filter to:

table
  .filterOpt(name) { case (t, n) => t.name === n }
  .filterIf(name.isEmpty)(_.name.isEmpty)

That produces the right query, but it seems like I should get that behaviour for free when using an optional column. Those two filters are mutually exclusive, and I can't think of the use case where where name = null is something I want.

Is this a defect? Is there a different filter primitive I should be using instead?

andyczerwonka commented 3 years ago

We could probably do something like this.

table.filter(r => name match {
  case None => r.name.isEmpty
  case Some(n) => _.name === n
})

But that doesn't compile. Type slick.lifted.Rep[_ >: Boolean with Option[Boolean]] cannot be a query condition (only Boolean, Rep[Boolean] and Rep[Option[Boolean]] are allowed I'm sure we can coerce the types here.

patrick-mullen commented 2 years ago

I have this same issue, none of the examples online I can find compile for me and I get this same error. @andyczerwonka were you able to figure this out or work around it? Furthermore, i need to OR multiples of these togethor so chaining separate filter additions doesn't appear to work for my case either.

andyczerwonka commented 2 years ago

@patrick-mullen I ended up with what I describe in the ticket... I mean it works, I was just expecting table.filter(_.name === name) to work, i.e. in name is optional and empty, generate the is null query. It's not that clever.

patrick-mullen commented 2 years ago

Yeah i can't figure out how to do that if i had multiple that needed to be ORed togethor. I assume you're only doing one at a time? And it seems like theres no way to keep tacking on filters and have them ORed togethor. filterOpt also was added in 3.3, whereas I'm on 3.2.

patrick-mullen commented 2 years ago

The fact that === doesn't work as expected for optional values is pretty shocking, this seems like it would be a very common use case.

nafg commented 2 years ago

I think silently changing a basic operation would require a certain amount of consensus and a version bump. I have no idea what amount of consensus that should be or how to attain it.

andyczerwonka commented 2 years ago

I think silently changing a basic operation would require a certain amount of consensus and a version bump. I have no idea what amount of consensus that should be or how to attain it.

Agree. I don''t know the use case where users would explicitly want = null vs is null, both those cases would break.

hvesalai commented 2 years ago

I don't think None should be automatically turned into is null since the other option is to treat it as "anything"

For example, we have this extension in our code:

    implicit class ConditionalQueryFilter[A,B,C[_]](q: Query[A,B,C]) {
      def filterOpt[D, T <: Rep[_] : CanBeQueryCondition](option: Option[D])
                   (f: (A, D) => T): Query[A, B, C] =
        option.map(d => q.filter(a => f(a,d))).getOrElse(q)

      def filterIf(p: Boolean)(f: A => Rep[Boolean]): Query[A,B,C] =
        if (p) q.filter(f) else q
    }
nafg commented 2 years ago

We could probably do something like this.

table.filter(r => name match {
  case None => r.name.isEmpty
  case Some(n) => _.name === n
})

But that doesn't compile. Type slick.lifted.Rep[_ >: Boolean with Option[Boolean]] cannot be a query condition (only Boolean, Rep[Boolean] and Rep[Option[Boolean]] are allowed I'm sure we can coerce the types here.

I'm assuming instead of _.name === n you actually have r.name === n because otherwise I would expect you to get a different error message.

I guess the issue is that r.name.isEmpty is a Rep[Boolean] but `r.name === n is a Rep[Option[Boolean]]since Slick treats comparisons on Option columns to return Option results, since that is how SQL works:col = exprwherecolis nullable is itself nullable; if col is NULL thencol = exprevaluates toNULL`. So the right hand sides of the pattern matches don't line up.

You can turn a Rep[A] into a Rep[Option[A]] with the .? method. So this should work, I expect:

table.filter(r => name match {
  case None    => r.name.isEmpty.?
  case Some(n) => r.name === n
})
nafg commented 2 years ago

I don't think None should be automatically turned into is null since the other option is to treat it as "anything"

For example, we have this extension in our code:

    implicit class ConditionalQueryFilter[A,B,C[_]](q: Query[A,B,C]) {
      def filterOpt[D, T <: Rep[_] : CanBeQueryCondition](option: Option[D])
                   (f: (A, D) => T): Query[A, B, C] =
        option.map(d => q.filter(a => f(a,d))).getOrElse(q)

      def filterIf(p: Boolean)(f: A => Rep[Boolean]): Query[A,B,C] =
        if (p) q.filter(f) else q
    }

Note that filterOpt and filterIf have been added to Slick so your extension methods are probably not being called.

That said your code doesn't use Rep[Option[?]] === ... so it isn't really a counterexample.

However as I said it would be an almost silent breaking change. I think Quill made such a change at one point, not sure. So maybe we can find out what they did.

The safest solution would be to add a different operator. That way === has the same semantics as = in SQL and we don't break Slick code, and we provide a shorthand for the common SQL pattern of (x IS NULL AND y IS NULL) OR x=y.

But I'm not convinced it's the best solution.

andyczerwonka commented 2 years ago

I don't think None should be automatically turned into is null since the other option is to treat it as "anything"

I kind of agree, but in that case I would not be adding a filter condition at all. You're saying, if Some(value), filter the list, else don't filter anything, which is what filterOpt does.

patrick-mullen commented 2 years ago

The safest solution would be to add a different operator. That way === has the same semantics as = in SQL and we don't break Slick code, and we provide a shorthand for the common SQL pattern of (x IS NULL AND y IS NULL) OR x=y.

If i had to pick I believe this is the best solution. It's additive and reasonably intuitive because of how = and is are already different operators in SQL.

jimm-porch commented 2 years ago

I had a need for this today and came up with a couple solutions that work for me. FYI in case it helps anyone else here.

The naive bug

In my use-case, I have a table of mailing addresses, with columns like street1, street2, city, etc. Any of those values can be null, so the Slick columns are optional types like Option[String], Option[PostalCode], etc. I need to be able to find an existing row matching some target row, and my naive first attempt looked like this:

    addressTable
      .filter(_.street1 === row.street1)
      .filter(_.street2 === row.street2)
      .filter(_.city === row.city)
      .filter(_.state === row.state)
      .filter(_.postalCode === row.postalCode)
      .result
      .headOption

This compiled and executed fine, but it wouldn't match existing rows if they had a null in any column, because the generated SQL predicate would look like WHERE state = null, which never evaluates to true.

The ugly workaround

My first clumsy verbose workaround was to make a mutable query variable, but the code was really awful:

    var query = row.street1 match {
      case Some(value) => addressTable.filter(_.street1 === value)
      case None        => addressTable.filter(_.street1.isEmpty)
    }
    query = row.street2 match {
      case Some(value) => query.filter(_.street2 === value)
      case None        => query.filter(_.street2.isEmpty)
    }
    query = row.city match {
      case Some(value) => query.filter(_.city === value)
      case None        => query.filter(_.city.isEmpty)
    }
    query = row.state match {
      case Some(value) => query.filter(_.state === value)
      case None        => query.filter(_.state.isEmpty)
    }
    query = row.postalCode match {
      case Some(value) => query.filter(_.postalCode === value)
      case None        => query.filter(_.postalCode.isEmpty)
    }

    query.result.headOption

Solution 1 - extension method on Query

In this approach, instead of saying filter we can say filterEvenIfNull

  implicit class QueryExtensions[T, R, C[_]](val query: Query[T, R, C]) {

    def filterEvenIfNull[V: BaseTypedType](getCol: T => Rep[Option[V]], targetValue: Option[V]): Query[T, R, C] =
      targetValue match {
        case Some(_) => query.filter((table: T) => getCol(table) === targetValue)
        case None    => query.filter((table: T) => getCol(table).isEmpty)
      }

  }
    addressTable
      .filterEvenIfNull(_.street1, row.street1)
      .filterEvenIfNull(_.street2, row.street2)
      .filterEvenIfNull(_.city, row.city)
      .filterEvenIfNull(_.state, row.state)
      .filterEvenIfNull(_.postalCode, row.postalCode)
      .result
      .headOption

This is much leaner than the verbose workaround, and produces the correct result where we can positively match null columns.

However, I'm not a fan of this because the parameters are (column, value) instead of (expression), so it feels like it doesn't match our expectations from using filter everywhere else.

Solution 2 - extension method on Rep

In this approach, we create a new is operator which can be used inside the familiar .filter expression.

  implicit class RepExtensions[T](val rep: Rep[Option[T]]) {

    def is(targetValue: Option[T])(implicit slickRecognizedType: BaseTypedType[T]): Rep[Option[Boolean]] =
      targetValue match {
        case Some(value) => rep === value
        case None        => rep.isEmpty
      }

  }
    addressTable
      .filter(_.street1 is row.street1)
      .filter(_.street2 is row.street2)
      .filter(_.city is row.city)
      .filter(_.state is row.state)
      .filter(_.postalCode is row.postalCode)
      .result
      .headOption

This restores the typical pattern where we use filter(expression).

I like the name is because it mimics the SQL is null syntax, which reminds the developer why we're using it.

nafg commented 2 years ago

Ok sounds like enough of a consensus. Anyone want to start the ball rolling with a PR?

robstoll commented 2 years ago

As workaround for others, we use the following to address the issue

  implicit final protected class RxSlickRepOption[R](rep: Rep[Option[R]])(implicit tt: BaseTypedType[R]) {

    /** Overwrites slick's behaviour; uses an `is null` check in case the given [[rep]] is None instead of `= null`
      */
    def ===(o: Option[R]): Rep[Option[Boolean]] = o.map(r => rep === r).getOrElse(rep.isEmpty.?)
    def ===(r: R): Rep[Option[Boolean]]         = profile.api.optionColumnExtensionMethods(rep) === r
  }