camertron / scuttle-rb

A library for transforming raw SQL statements into ActiveRecord/Arel queries. Ruby wrapper and tests for scuttle-java.
86 stars 2 forks source link

Aggregate function with FILTER #8

Closed atrost closed 6 years ago

atrost commented 6 years ago

Hi @camertron

I've been trying to translate the following SQL that uses an array_aggr function with a FILTER option but Scuttle interprets filter as an alias.. The FILTER option/clause extends aggregate functions by an additional WHERE clause so that the result of the aggregate function is built from only the rows that satisfy the additional filter clause. I figured out it's quite powerful as I was able to reduce my initial query that relied on subqueries to a one plain single select statement.

A simplified and working version (Postgresql) of my aggregate query goes like this:

  SELECT DISTINCT campaigns.id as order_id, 
  array_agg( salespersons.first_name) FILTER (WHERE campaign_market_sales_reps.is_primary IS FALSE) as arr_agg_with_filter 
  FROM campaigns
    LEFT OUTER JOIN campaign_markets on campaigns.id = campaign_markets.campaign_id
    LEFT OUTER JOIN campaign_market_sales_reps on campaign_market_sales_reps.campaign_market_id = campaign_markets.id   
    LEFT OUTER JOIN salespersons on salespersons.id = campaign_market_sales_reps.salesperson_id   
    GROUP BY order_id
 ORDER BY order_id; 

Scuttle translates this to (omitted irrelevant part for brevity):

Campaign.select(
  [
    Campaign.arel_table[:id].as('order_id'), Arel::Nodes::NamedFunction.new(
      'array_agg', [Salesperson.arel_table[:first_name]]
    ).as('FILTER')
  ]
).joins( ... )

Scuttle seems to handle nested aggregate function well but the FILTER clause breaks things. My actual query is this:

SELECT DISTINCT campaigns.id as order_id,
            array_to_string( array_agg( FORMAT('%s,%s',salespersons.first_name, salespersons.last_name)) FILTER( WHERE campaign_market_sales_reps.is_primary IS FALSE), ',') as arr_agg_with_filter     
  FROM campaigns
    LEFT OUTER JOIN campaign_markets on campaigns.id = campaign_markets.campaign_id
    LEFT OUTER JOIN campaign_market_sales_reps on campaign_market_sales_reps.campaign_market_id = campaign_markets.id   
    LEFT OUTER JOIN salespersons on salespersons.id = campaign_market_sales_reps.salesperson_id   
    GROUP BY order_id
    HAVING  ( array_to_string( array_agg( FORMAT('%s,%s',salespersons.first_name, salespersons.last_name)) FILTER( WHERE campaign_market_sales_reps.is_primary IS FALSE), ',') ) ILIKE '%Tim%' 
 ORDER BY order_id; 

Will appreciate any help in translating this to AREL.

camertron commented 6 years ago

Hey there @atrost,

Ok, so I did some playing around and unfortunately it doesn't look like arel is going to be able to handle your query. There was an issue filed against arel (before it moved into rails) by someone offering to add FILTER support, but it doesn't look like a companion issue was filed after the move. I also grepped through the arel source code and couldn't find an existing implementation :(

However! I assume if you have array_agg available then you also have array_remove. Scuttle (sort of) supports CASE statements, so this works:

 SELECT DISTINCT campaigns.id as order_id, 
 array_remove(array_agg(CASE WHEN campaign_market_sales_reps.is_primary IS FALSE THEN salespersons.first_name ELSE NULL END), NULL)
 FROM campaigns
   LEFT OUTER JOIN campaign_markets on campaigns.id = campaign_markets.campaign_id
   LEFT OUTER JOIN campaign_market_sales_reps on campaign_market_sales_reps.campaign_market_id = campaign_markets.id   
   LEFT OUTER JOIN salespersons on salespersons.id = campaign_market_sales_reps.salesperson_id   
   GROUP BY order_id
ORDER BY order_id; 

Scuttle turns this into an expression with some literal SQL:

Campaign.select(
  [
    Campaign[:id].as('order_id'), Arel::Nodes::NamedFunction.new(
      'array_remove', [
        Arel::Nodes::NamedFunction.new(
          'array_agg', [
            Arel.sql(
              'CASE WHEN campaign_market_sales_reps.is_primary IS FALSE THEN salespersons.first_name ELSE NULL END'
            )
          ]
        )
      ]
    )
  ]
)

Is it ideal? No. Does it work? Probably.

atrost commented 6 years ago

Thanks for looking into it @camertron ! That should work though it gets more tricky if I need to trow in additional named functions into the equation namely array_to_string and format. I ended up, similar to your example, with a hybrid version using Arel with some literal SQL but still utilizing the FILTER WHEN clause and that works just fine.

camertron commented 6 years ago

Awesome, glad you got it working :)

Your issue prompted me to look into adding proper case statement handling in Scuttle, so be on the lookout for that in the next few days.

atrost commented 6 years ago

Thanks. This's probably straightforward but here is the snippet of my literal SQL for the aggregation part that I mentioned. Just thought to share:


Campaign.arel_table[:id].as('order_id'), 
   Arel.sql(%(
      array_to_string( 
         array_agg( 
             format('%s,%s',salespersons.first_name,salespersons.last_name)) 
                filter( WHERE campaign_market_sales_reps.is_primary IS FALSE
         ), 
      ','))
    ).as('sp') ```