kiranandcode / petrol

Petrol's an OCaml SQL API made to go FAST.
https://kiranandcode.github.io/petrol/petrol/index.html
Other
113 stars 6 forks source link

Example of join #3

Open beajeanm opened 1 year ago

beajeanm commented 1 year ago

Not sure if it's a bug of me just misunderstanding how to use joins.

Assuming the following tables:

let (person, Expr.[id; name]) = StaticSchema.declare_table schema ~name:"person"
Schema.[field "id" ~ty:Type.int; field "name" ~ty:Type.text];;

let (pet, Expr.[pet_id; pet_name; owner_id]) = StaticSchema.declare_table schema ~name:"pet"
Schema.[field "id" ~ty:Type.int; field "name" ~ty:Type.text; field "owner" ~ty:Type.int];;

Then I'd assume I could write a join query as follow:

Query.select ~from:person Expr.[id; name; pet_name]
|> Query.join ~on:Expr.(id = owner_id) (Query.select ~from:pet Expr.[owner_id; pet_name])
|> Format.asprintf "%a" Query.pp;;

This produces:

SELECT person.id, person.name, pet.name
FROM person INNER JOIN (SELECT pet.owner, pet.name
FROM pet) AS join_tmp_0 ON person.id = pet.owner

So the inner select on the pet table has been aliased to join_tmp_0 but the outer select and the on clause still use the pet prefix.

I assume I misundertood something in the join API, but I could find any relevant example.

kiranandcode commented 1 year ago

Ah, no @beajeanm, that is a bug.

I introduced these anonnymous aliases as a quick fix because Postgres complains if all subqueries aren't named, but it seems the change broke queries that should otherwise work.

You can avoid the problem my using Expr.as_ ~name expr to rename your expressions:

let owner_id, owner_id_ref = Expr.(as_ owner_id ~name:"owner_id") in
let pet_name, pet_name_ref = Expr.(as_ pet_name ~name:"pet_name") in
Query.select ~from:person Expr.[id; name; pet_name_ref]
|> Query.join ~on:Expr.(id = owner_id_ref) (Query.select ~from:pet Expr.[owner_id; pet_name])
|> Format.asprintf "%a" Query.pp;;

but evidently, I think it would be better to have a proper implementation of aliasing for tables that takes this into account.

I'll implement this today or tomorrow.

beajeanm commented 1 year ago

Thanks, I'll have a try at that

sam-huckaby commented 1 year ago

I'm wondering if this bug was addressed yet? I'm having sort of the reverse problem, where my subqueries are not being aliased at all and that causes my joins to fail.

I'm using the below table definitions and decoder function.

(* declare a table, returning the table name and fields *)
module Users = struct
let table, Expr.[id ; username ; bio ; display_name] =
  VersionedSchema.declare_table schema ~name:"users"
     Schema.[
        field ~constraints:[primary_key ~auto_increment:true ()] "id" ~ty:Type.int;
        field "username" ~ty:Type.(character_varying 32);
        field "bio" ~ty:Type.(character_varying 32);
        field "display_name" ~ty:Type.(character_varying 32);
     ]
end

(* declare a table, returning the table name and fields *)
module Posts = struct
let table, Expr.[id ; user_id ; message ; created] =
  VersionedSchema.declare_table schema ~name:"posts"
     Schema.[
      field ~constraints:[primary_key ~auto_increment:true () ; not_null ()] "id" ~ty:Type.big_int;
      field ~constraints:[foreign_key ~table:Users.table ~columns:Expr.[Users.id] ()] "user_id" ~ty:Type.int ;
      field "message" ~ty:(Type.character_varying 140) ;
      field "created" ~ty:Type.time ;
     ]
end

type post_result = {
  id : int64 ;
  message : string ;
  username : string ;
  display_name : string ;
  created : Ptime.t ;
}

let decode
      (id,
       (message,
        (username,
         (display_name,
          (created, ()))))) = {
    id = id ;
    message ;
    username ;
    display_name ;
    created ;
  }

let fetch_posts db =
  let users = Query.select [Users.id ; Users.username ; Users.display_name] ~from:Users.table in
  let posts = Query.select [Posts.id ; Posts.message ; Users.username ; Users.display_name ; Posts.created] ~from:Posts.table in
  let on = Expr.(Users.id = Posts.user_id) in
  Query.join ~op:INNER ~on users posts
  |> Request.make_many
  |> Petrol.collect_list db
  |> Lwt_result.map (List.map decode)

The fetch_posts function generates SQL which looks like the following:

SELECT
    posts.id,
    posts.message,
    users.username,
    users.display_name,
    posts.created
FROM posts
INNER JOIN (
    SELECT
        users.id,
        users.username,
        users.display_name
    FROM users
)
ON (users.id) = (posts.user_id)

Which lacks an alias for the subquery generated for the inner join. If you point me in the right direction, I could try to work on a fix. I really appreciate your work on this library, it's made OCaml a lot of fun to learn.

beajeanm commented 1 year ago

@sam-huckaby which version are you using?

Trying the code you provided on my machine (with petrol 1.2.0) I get the same behaviour as my example (an alias to join_tmp_0) and the provided work-around works.

Since this ticket is still open, you have to assume this is the recommended way to do join queries for now 🙂

sam-huckaby commented 1 year ago

Interesting! I am also using 1.2.0, but I'm not getting an alias (or at least the error I'm getting back from PostgreSQL indicates that it is missing. I think this question would be better suited to a separate issue though, since I originally misunderstood the workaround being suggested here. sorry to clutter things up.

beajeanm commented 1 year ago

or at least the error I'm getting back from PostgreSQL indicates that it is missing.

That why in the example above we've printed the produce SQL rather than trying to understand what went wrong based on the DB error message.

since I originally misunderstood the workaround being suggested here.

The current version use the pattern Table.field in the select clause, but Table is only mentioned in the sub-query of the join, so it's not a valid identifier in the top level. The current solution is to manually alias the fields used in the subquery so they can be re-used in the select clause.

So

let owner_id, owner_id_ref = Expr.(as_ owner_id ~name:"owner_id") in
let pet_name, pet_name_ref = Expr.(as_ pet_name ~name:"pet_name") in
Query.select ~from:person Expr.[id; name; pet_name_ref]
|> Query.join ~on:Expr.(id = owner_id_ref) (Query.select ~from:pet Expr.[owner_id; pet_name])
|> Format.asprintf "%a" Query.pp;;

Will geneate the following SQL.

SELECT person.id, person.name, pet_name
FROM person INNER JOIN (SELECT pet.owner AS owner_id, pet.name AS pet_name
FROM pet) AS join_tmp_0 ON person.id = owner_id

As you can see when you compare that to the first SQL generated, each field in the sub select is aliased (SELECT pet.owner AS owner_id, pet.name AS pet_name). So these aliases become valid fields for the top level query.