bitemyapp / esqueleto

New home of Esqueleto, please file issues so we can get things caught up!
BSD 3-Clause "New" or "Revised" License
376 stars 108 forks source link

`on` condition for `leftJoin` should not have `Maybe` in the newly introduced table #349

Open glasserc opened 1 year ago

glasserc commented 1 year ago

Given these models:

User
    name    Text
    age     Int
    deriving Show
Organization
    Id      Text    default=uuid_generate_v1mc()
    name    Text
UserOrganization
    user            UserId
    organization    OrganizationId
    Primary user organization

You can write the following code in Esqueleto

usersWithOrganizations3
  :: SqlPersistT IO [(Entity User, Maybe (Entity UserOrganization))]
usersWithOrganizations3 =
  select $ do
    (u :& uo) <- from $
      table @User
        `leftJoin`
          table @UserOrganization
            `on` \(_ :& uo) -> isNothing (uo ?. #user)    -- [1]
    pure (u, uo)

Note the isNothing on field access of an entity. This corresponds to the (legal) SQL:

SELECT *
FROM "user" u LEFT JOIN user_organization uo ON uo.user IS NULL

I would like to argue that this should be a compiler error. uo.user can never be null.

Of course, the return type of the function is OK -- the result of the join may be a row where the UserOrganization is absent.

However, in the on condition, the user organization cannot be absent. For ALL join types, the on condition is tried against candidate rows from each table. It is impossible to have a non-existent candidate. (I haven't looked at a SQL standard but the documentation for joins in Postgres as well as in Sqlite explain that left joins are implemented in terms of inner joins, with extra rows added afterwards.)

My feeling is that this issue is of low impact. The current behavior is "safe" because it's safe to promote something that is non-nullable into something that is nullable, and while having to handle the Maybe is a little annoying (see e.g. #298), in general we can just wrap whatever other operand we want in the join condition with a Just. It would be nice if the above example didn't compile, but if it didn't, we could probably generate other nonsensical SQL queries using isNothing (just $ uo ^. #user).

On the other hand, fixing the issue might be tricky. Currently we rely on the type produced by the join being concordant with the type of the condition in the on. We don't want to throw that away, since it is only the last table where we may assume the presence of the candidate. Consider this:

SELECT *
FROM "user" u LEFT JOIN user_organization uo ON FALSE 
    LEFT JOIN organization o ON COALESCE(ou.id, 'Engineering') = o.id

Here, use of ou.id in the join condition for the third table is not guaranteed to be present. In Haskell, we can write something like:

usersWithOrganizations3
  :: SqlPersistT IO [(Entity User, Maybe (Entity UserOrganization))]
usersWithOrganizations3 =
  select $ do
    (u :& uo :& _) <- from $
      table @User
        `leftJoin`
          table @UserOrganization
            `on` (\(_ :& uo) -> val False)
        `leftJoin`
          table @Organization
            `on` (\(_ :& uo :& o) -> just (coalesceDefault [uo ?. #organization] (val $ OrganizationKey "Engineering")) ==. o ?. #id)
    pure (u, uo)

Here, the ou in the last on clause is correctly a Maybe.

However, again it would be better if `o` was not a `Maybe`. Then we could write it like this: ``` usersWithOrganizations3 :: SqlPersistT IO [(Entity User, Maybe (Entity UserOrganization))] usersWithOrganizations3 = select $ do (u :& uo :& _) <- from $ table @User `leftJoin` table @UserOrganization `on` (\(_ :& uo) -> val False) `leftJoin` table @Organization `on` (\(_ :& uo :& o) -> coalesceDefault [uo ?. #organization] (val $ OrganizationKey "Engineering") ==. o ^. #id) pure (u, uo) ```

I totally understand if this bug never gets fixed and even if we want to close it but I found it curious and wanted to make sure the justification was written down somewhere.