mislav / will_paginate

Pagination library for Rails and other Ruby applications
http://github.com/mislav/will_paginate/wikis
MIT License
5.7k stars 864 forks source link

count(*) problem with postgres. #45

Open junaid opened 14 years ago

junaid commented 14 years ago

This plugin don't works with postgres for some queries that have distinct and selecting more than one column because Postgres follow sql standard strictly and mysql added few things in sql. If you change the count query using subquery then it will work fine for both of them mysql and postgers. Now plugin do following things in case of mysql and postgres.Here is example.

Mysql :: Regular query :: SELECT DISTINCT tenders.* FROM tenders INNER JOIN assignments ON tenders.id = assignments.assignable_id AND assignments.assignable_type = 'Tender'

Count query :: SELECT count(DISTINCT tenders.id) AS count_distinct_tenders_id FROM tenders INNER JOIN assignments ON tenders.id = assignments.assignable_id AND assignments.assignable_type = 'Tender'

Postgres :: Regular query :: SELECT DISTINCT "tenders".* FROM "tenders" INNER JOIN "assignments" ON "tenders".id = "assignments".assignable_id AND "assignments".assignable_type = E'Tender'

Count query :: SELECT count(DISTINCT "tenders".*) AS count_distinct_tenders_all FROM "tenders" INNER JOIN "assignments" ON "tenders".id = "assignments".assignable_id AND "assignments".assignable_type = E'Tender'

This count query of postgres gives following error "PGError: ERROR: could not identify an equality operator for type tenders". As per postgres documentation you can only give one expression to count function which is sql standard but mysql allows more than one expressions in count(*) function. If you transform this query using this then it will work for both.

SELECT count() from( SELECT DISTINCT "tenders". FROM "tenders" INNER JOIN "assignments" ON "tenders".id = "assignments".assignable_id AND "assignments".assignable_type = E'Tender' ) as temp

I also tried new count query way with sqlite and it worked so we are safe to implement for sqlite,mysql and postgres.

Thanks Junaid.

mislav commented 14 years ago

Thanks for detail analysis for this issue. I knew about it before, but I needed someone to dissect it for me. Also, I never really had the motivation to fix it inside will_paginate since it's clearly an ActiveRecord issue. Putting too much assumptions about SQL inside will_paginate is not where I want to go.

Can you check how ActiveRecord 3.0 handles these cases? Because it's built on Arel now, so it must understand relational logic better. I'm interested if it's possible to build a scope with "distinct" (like here) and then call count and perform a limited find all without any special parameters or workarounds.

sphogan commented 14 years ago

I can comment on this and potentially offer a patch (or the start of a patch).

I had this same issue with Microsoft's SQL Server (yea, yea, I know, but I have to for work). As junaid pointed out, will_paginate re-writes the query for MySQL so that it becomes DISTINCT(table_name.id). It doesn't do that for other databases and I think I found the issue.

MySQL uses the backtick () to denote table names and such. SQL Server, on the other hand, uses "[" and "]" to enclose things and it looks (from junaid's post) that PostgreSQL uses standard double-quotes. In lib/will_paginate/finder.rb (line 201 in version 2.3.12), there's a gsub that removes the backticks from the select part of the sql statement and (if a substitution was performed), it goes on to change the select part of it to beDISTINCT #{klass.table_name}.#{klass.primary_key}`. So, if it doesn't detect any backticks in the original select, then it doesn't replace the select part with a count of the primary key.

This was easily fixed (for SQL Server) by changing the gsub to a regexp including "[" and "]" as well as the backtick (so it's now if options[:select].gsub(/[\[\]\]/, '') =~ /\w+.*/`). I might have to repost that last part since I'm not sure how the escaping will work. Anyway, that line could easily be altered to include the double-quote that PostgreSQL is using to enclose table names and therefore make will_paginate work on PostgreSQL as well.

EDIT: please ignore the backslash before the backtick in the gsub in the last code snippet. If I remove the backslash, it ends code mode at that backtick, but the backslash should be thought of as an escape character there. I'm not completely sure what's going on, but the square brackets need a backslash before them in the regexp, but the backtick shouldn't have one before it.

mislav commented 14 years ago

I'm really dying here of need for a proper, tested patch … You guys did great investigative work, that could maybe continue in form of a pull request? ;) Maybe?

kjg commented 14 years ago

I think 470dae1da39ffa4c70237ed9455f0051d3814156 fixes this one too

junaid commented 14 years ago

thanks kjg, i will check this and update this ticket.

house9 commented 13 years ago

this issue looks quite old, but seems to still exist I am using will_paginate 3.0.2 with postgres and rails 3.1, it seems to ignore DISTINCT

Table1.joins("LEFT OUTER JOIN table2 ON ...").paginate(:page => @params[:page]) pagination: SELECT COUNT(*) FROM "table1" LEFT OUTER ... real query: SELECT DISTINCT table1.*, table2.* FROM "table1" LEFT OUTER JOIN table2...

so the pagination is coming up with more records than my DISTINCT query does, the fix from kjg looked like it should do what I want, but the code base no longer has a finder.rb file, I guess I'll take a look at lib/will_paginate/active_record.rb

house9 commented 13 years ago

just came up with a dirty workaround, might help someone else

query = Foo.joins("LEFT OUTER JOIN bars ON bars.id = foos.id").where(etc....).select("DISTINCT foos.*, bars.*") Foo.paginate_by_sql(query.to_sql, :page => @params[:page])

the problem does not exist in paginate_by_sql

dima4p commented 13 years ago

Maybe something like this will help?: "select count(*) from (#{collection.to_sql}) count_sql"

ted-collins commented 12 years ago

house9 -- many thanks, this solved it (at least until a real fix is in place -- I suspect this is going to get pokey on large datasets).

clifton commented 12 years ago

@mislav would you accept a pull request for this?

https://github.com/mislav/will_paginate/blob/master/lib/will_paginate/active_record.rb#L67

rel = rel.apply_finder_options(@wp_count_options) if defined? @wp_count_options
rel.count

i believe #apply_finder_options overwrites any :select option sent to it when rel.count runs, so perhaps wp_count_options should accept :distinct => true and have that passed along such that rel.count :distinct => true

JRhyne commented 12 years ago

Bump

kidlab commented 11 years ago

@mislav Can you please update the document how to use this new option? It seem that @clifton's patch is merged to the lasted release, but I don't know how to use it. Thanks.