slick / slick

Slick (Scala Language Integrated Connection Kit) is a modern database query and access library for Scala
https://scala-slick.org/
BSD 2-Clause "Simplified" License
2.65k stars 609 forks source link

Non valid select generated with union followed by filter and used in joins #2116

Open tg44 opened 4 years ago

tg44 commented 4 years ago

DESCRIPTION:

I do a union, a filter on that union, and a join. At this point, I get an sql, which tries to use a field which is not declared previously.

I'm not a pro at slick internals, the example code is the minimum where I could reproduce the problem, probably it could be pinpointed better.

STEPS TO REPRODUCE:

import $ivy.`ch.qos.logback:logback-classic:1.2.3`
import $ivy.`com.typesafe.slick::slick:3.3.2`

object TestSlick {
  import slick.jdbc.PostgresProfile.api._

  case class DBO1(id: Long, a: Long, b: Long)
  class DBO1Table(tag: Tag) extends Table[DBO1](tag, "dbo1") {
    def id: Rep[Long] = column[Long]("id", O.PrimaryKey, O.AutoInc)
    def a: Rep[Long] = column[Long]("a")
    def b: Rep[Long] = column[Long]("b")

    def * = (id, a, b) <> ((DBO1.apply _).tupled, DBO1.unapply)
  }

  val db1s = TableQuery[DBO1Table]

  val query1 = db1s.map(r => r.id -> r.a)
  val query2 = db1s.map(r => r.id -> r.b)

  val query3 = (query1 union query2).distinct.groupBy(_._1).map { case (gId, r) => gId -> r.length }
  val interestingIds = query3.filter(_._2 >= 5).map(_._1)

  val query5 =
    for {
      interestingId1 <- interestingIds
      d1 <- db1s
      if d1.id === interestingId1
    } yield (d1.id)

  val statement = query5.result.statements.head
}

println(TestSlick.statement)

Could be runned with pbpaste > slickTest.sc && amm slickTest.sc if you have ammonite installed.

EXPECTED RESULT:

A working sql.

ACTUAL RESULT:

select s59."id"
from (select s77.s75 as s51, count(1) as s52
      from (select distinct on (s73.s45, s73.s46) s73.s45 as s75, s73.s46 as s76
            from ((select "id" as s45, "a" as s46 from "dbo1") union (select "id" as s45, "b" as s46 from "dbo1")) s73) s77
      group by s77.s75) s58,
     "dbo1" s59
where (s58.s52 >= 5)
  and (s59."id" = s58.s53)

Here the s59.id = s58.s53 is wrong. It should be s59."id" = s58.s51.

SETUP:

You only need ammonite installed, and run the script in the first codeblock. No need for db, or tables, or anything.

hvesalai commented 4 years ago

I suggest to work around by refactoring the query.

What are you trying to achieve with

  val query5 =
    for {
      interestingId1 <- interestingIds
      d1 <- db1s
      if d1.id === interestingId1
    } yield (d1.id)
tg44 commented 4 years ago

I don't really want to tell the whole usecase, bcs its rather large (I count group interactions). The shown part is just the beginning of a much larger query (where I filter out the groups with too few interacters to preserv anonimity). (Ofc I gather the data across multiple tables with much complex queries, not just from one as the example shows.) On the working app the subquery in question seems like:

val interestingGroupPairs =
      for {
        group1 <- groupRepository.groups
        group2 <- groupRepository.groups
        if group1.id < group2.id
        if group1.id inSetBind interestingGroupIds 
        //this would be sth like interestingId1 <- interestingIds and if group1.id = interestingId1
        if group2.id inSetBind interestingGroupIds
      } yield (group1, group2)

That is bad that I need to run separated querries and inset-bind twice just for a "combine join" (it works for now, but only because I have a "small" amount of groups.). Also it was a bit sad moment when I realized that the same query went into the output sql twice (instead of creating a common table expression)...

I think if the data and the use-case goes larger I will need to rewrite this to native sql anyway, based on the quality/size of the generated sql vs the hand written one :(

But all of the above is totally unrelated of the bug I found reproduced, and minimalized ;)

DevNico commented 7 months ago

Just stumbled upon the same issue. We have multiple selects from the same table that are each joined to different other tables and combined them using union. If we leave it at that everything works but when adding additional filters on that unionised query the generated sql has a mismatch in the generated aliases for each subquery.

nafg commented 7 months ago

What version of Slick did you test?

DevNico commented 7 months ago

This was reproducible in 3.5.1 (we're also using slickpg 0.22.1). Another (maybe) interesting point is that one union worked but from 2 to 5 in our case it seems to be broken.

Interestingly I just found a workaround by calling .map(identity) between the last .union and .filter.