camertron / scuttle-rb

A library for transforming raw SQL statements into ActiveRecord/Arel queries. Ruby wrapper and tests for scuttle-java.
86 stars 2 forks source link

SQL JOIN with nested subquery/ies #9

Open atrost opened 6 years ago

atrost commented 6 years ago

It seems Scuttle cannot handle queries with a subquery inside a JOIN. The following example uses only one level of nesting:

SELECT c.id, c.name, t.status 
FROM campaigns c                                                         
LEFT OUTER JOIN (                                                           
    SELECT campaigns.id, campaign_approvals.status as status                                 
    FROM campaigns                                                   
    LEFT OUTER JOIN  campaign_approvals on campaign_approvals.campaign_id = campaigns.id                     
    LEFT OUTER JOIN  users on  campaign_approvals.user_id = users.id                             
    WHERE (approval_type = 'publisher_approval' AND (status = 'approved' OR status = 'declined') AND archived IS FALSE)     
 ) t on c.id=t.id                                                        
GROUP BY c.id, t.status  

Scuttle turns this to:

Campaign.select(
  [
    C.arel_table[:id], C.arel_table[:name], T.arel_table[:status]
  ]
).joins(
  Campaign.arel_table.join(User.arel_table, Arel::Nodes::OuterJoin).on(
    CampaignApproval.arel_table[:user_id].eq(User.arel_table[:id])
  ).join_sources
).group(C.arel_table[:id], T.arel_table[:status])

The nested SELECT inside the LEFT OUTER JOIN statement gets ignored..

Scuttle does recognize subquery patterns like this:

SELECT  t.id, t.name, ...  
   FROM ( a.id, a.name,...
    SELECT ...
        FROM sometable a
  ) t
camertron commented 6 years ago

Hey @atrost,

This is another tough one. It looks like activerecord itself doesn't support this very well. Like all things activerecord/arel however, it's just a matter of coming up with the right incantation :)

First of all - as you discovered - this works:

Post.select(Post.where(id: 1))

It generates a subquery inside the FROM clause. However, doing the same thing with #joins does not work. The generated SQL query is missing the JOIN entirely.

Post.joins(Comment.where(id: 1))

We can however construct the join and table alias manually:

t = Arel::Table.new('t')

Post
  .joins(
    Post.arel_table.join(
      Arel::Nodes::TableAlias.new(Comment.where(id: 1).arel, 't'),
      Arel::Nodes::InnerJoin
    ).on(
      t[:post_id].eq(Post.arel_table[:id])
    ).join_sources
  )

Fortunately this works and produces the following SQL query:

SELECT "posts".* FROM "posts"
INNER JOIN (SELECT "comments".* FROM "comments" WHERE "comments"."id" = 1) "t"
ON "t"."post_id" = "posts"."id"

Un fortunately it only seems to work for activerecord 5.2. For versions < 5.2, activerecord appears to leave bind variables for nested subqueries behind (for joins at least), and ends up emitting question marks instead of constants. Eg:

SELECT "posts".* FROM "posts"
INNER JOIN (SELECT "comments".* FROM "comments" WHERE "comments"."id" = ?) "t"
ON "t"."post_id" = "posts"."id"

We can add the bind variables back in by hand, but at this point it really feels like we're fighting against the library. Instead, let's simply convert the subquery into raw SQL and wrap it with parens:

Post
  .joins(
    Post.arel_table.join(
      Arel::Nodes::TableAlias.new(Arel.sql("(#{Comment.where(id: 1).to_sql})"), 't'),
      Arel::Nodes::InnerJoin
    ).on(
      t[:post_id].eq(Post.arel_table[:id])
    ).join_sources
  )

It works! I'll see if I can add this functionality to Scuttle.

atrost commented 6 years ago

Hey @camertron
If you add this functionality to Scuttle I will be happy to try it out. My actual query is much more complicated than the one I posted above so I ended up going the 'find_by_sql' route in this case. Not ideal, but it works. Thanks!