camertron / scuttle-rb

A library for transforming raw SQL statements into ActiveRecord/Arel queries. Ruby wrapper and tests for scuttle-java.
86 stars 2 forks source link

SQL with correlated subquery #13

Closed roscom closed 3 years ago

roscom commented 3 years ago

I have the sql worked out and it works correctly in practice. Using scuttle to convert to Arel format introduces a problem. It would appear that the correlated table in the subquery does not reference the outer query. My schema looks like this:

            **roles**
            /           \
**domains**(role_id)    **client_blog_allocations**(blog_id, allocation_id)
     /                                     /                                 \
    /                               **blogs**              **allocations**
   /                                    /
**posts**(blogs_id, domain_id)

The working sql with subquery highlighted:

select roles.std_account,
       blogs.url,
       allocations.name,
       count(posts.id),
       max(posts.posted_on),
       max(client_blog_allocations.allocated_on)
from roles
inner join client_blog_allocations on roles.id = client_blog_allocations.role_id
inner join blogs on blogs.id = client_blog_allocations.blog_id 
inner join allocations on allocations.id = client_blog_allocations.allocation_id
inner join posts on posts.blog_id = blogs.id
where posts.domain_id in **(select id from domains where role_id = roles.id)**
  and roles.expired_on is null
  and roles.role_type_id = 2
  and blogs.state in ('deployed','suspended')
  and posts.type = 'LinkPost'
  and posts.state = 'published'
group by roles.std_account,
         blogs.url,
         allocations.name
order by roles.std_account,
         allocations.name,
         blogs.url;

The generated Arel "equivalent", again with the correlated sub query highlighted

Role.select(
        [
          Role.arel_table[:std_account], Blog.arel_table[:url], Allocation.arel_table[:name], Post.arel_table[:id].count, Post.arel_table[:posted_on].maximum, ClientBlogAllocation.arel_table[:allocated_on].maximum
        ]
      ).**where(
        Post.arel_table[:domain_id].in(
          Domain.select(:id).where(
            Domain.arel_table[:role_id].eq(Role.arel_table[:id])**
          )   
        ).and(
          Role.arel_table[:expired_on].eq(nil).and(
            Role.arel_table[:role_type_id].eq(2).and(
              Blog.arel_table[:state].in(['deployed', 'suspended']).and(
                Post.arel_table[:type].eq('LinkPost').and(Post.arel_table[:state].eq('published'))
              )   
            )   
          )   
        )
      ).joins(
        Role.arel_table.join(ClientBlogAllocation.arel_table).on(
          Role.arel_table[:id].eq(ClientBlogAllocation.arel_table[:role_id])
        ).join_sources
      ).joins(
        Role.arel_table.join(Blog.arel_table).on(
          Blog.arel_table[:id].eq(ClientBlogAllocation.arel_table[:blog_id])
        ).join_sources
      ).joins(
        Role.arel_table.join(Allocation.arel_table).on(
          Allocation.arel_table[:id].eq(ClientBlogAllocation.arel_table[:allocation_id])
        ).join_sources
      ).joins(
        Role.arel_table.join(Post.arel_table).on(
          Post.arel_table[:blog_id].eq(Blog.arel_table[:id])
        ).join_sources
      ).order(
        Role.arel_table[:std_account], Allocation.arel_table[:name], Blog.arel_table[:url]
      ).group(
        Role.arel_table[:std_account], Blog.arel_table[:url], Allocation.arel_table[:name]
      )

resulting in: ActiveRecord::StatementInvalid (Mysql2::Error: Unknown column 'roles.id' in 'where clause') which refers to the correlated subquery.

Would you be able to point me in the right direction regards this? ~ Ross

camertron commented 3 years ago

Hey @roscom, it looks like you've encountered a bug in Scuttle. If I recall correctly, Scuttle used to handle nested subqueries like this correctly, but something has probably changed in recent versions of activerecord/arel. I played around in the console for a while and got it to work. The key is using an arel table and the project method when constructing the subquery:

Post.arel_table[:domain_id].in(
  Domain.arel_table.project(:id).where(
    Domain.arel_table[:role_id].eq(Role.arel_table[:id])
  )
)

Hope that helps :)