datamapper / dm-aggregates

DataMapper plugin providing support for aggregates on collections
http://datamapper.org/
MIT License
16 stars 15 forks source link

Can't order by aggregated fields #10

Closed solnic closed 13 years ago

solnic commented 13 years ago

I'm trying to use the aggregate method and sort the results by a summed field, but it doesn't work.

For example, given the following model, I want to select the top 5 user_ids, by sum of quantity (descending):

class Gift
  include DataMapper::Resource

  belongs_to :user

  property :id, Integer, :serial => true
  property :user_id, Integer, :nullable => false, :index => true
  property :quantity, Integer, :nullable => false, :default => 1
end

This gives me 5 user_ids ordered by the single largest quantity (not quite right):

Gift.aggregate(:fields => [:user_id, :quantity.sum], :limit => 5, :order => [:quantity.desc])
SELECT `user_id`, SUM(`quantity`) FROM `gifts` WHERE (`deleted_at` IS NULL) GROUP BY `user_id` ORDER BY `quantity` DESC LIMIT 5

What I'm looking for is something that will generate the query:

SELECT `user_id`, SUM(`quantity`) FROM `gifts` WHERE (`deleted_at` IS NULL) GROUP BY `user_id` ORDER BY SUM(`quantity`) DESC LIMIT 5

...the most logical syntax for which would be something like:

Gift.aggregate(:fields => [:user_id, :quantity.sum], :limit => 5, :order => [:quantity.sum.desc])

Created by Erik Michaels-Ober - 2009-01-27 21:34:31 UTC

Original Lighthouse ticket: http://datamapper.lighthouseapp.com/projects/20609/tickets/789

solnic commented 13 years ago

Currently DM::Query::Operator does not allow chaining like this.

I’ve been thinking that the DM specific functions probably shouldn’t be DM::Query::Operator, and should perhaps be DM::Query::Function instances, which could inherit from DM::Query::Operator and provide chaining.

Michael, what do you think?

by Dan Kubb (dkubb)

solnic commented 13 years ago

Yes, we need to split/extend Query and Operator in near future so not only this request is satisfied, but also search and raw SQL specific limitations we discussed already. I think Function is a good idea, however, it is a bit magical, and we have to be very careful with documentation and implementation.

by Michael Klishin (antares)

solnic commented 13 years ago

I have a patch I can upload that allows the following:

Contact.(:order=>[:first_name.desc(’lower(first_name)’)])

which will produce the following sql:

SELECT "id", "account_id", "first_name", "last_name" FROM "contacts" ORDER BY lower(first_name) DESC

I agree that DM::Query::Function instances would be cool, but it’s a lot to ask of DM to be all things to all people, and for edge cases, allowing people to drop to raw text is better then monkey patching the library!

My patch basically adds a "raw" attribute to Operator and Direction. The above use is a simple example, but in my app I need to order by the text ’(lower(trim(last_name || first_name)))’ and this patch does the trick. IMHO, anyone not using mysql is going to have a hard time not being able to use their functional indexes for sorting.

btw, I starting hacking away this afternoon on porting a rails app to Sinatra/DataMapper and this was the first show-stopper. I’m happy to see a recent ticket dealing with the same issue, but I’m dead tired, so after a few zz’s I’ll attach the patch to this ticket if your interested.

by Troy K

solnic commented 13 years ago

I have a patch I can upload

Please do. Can you also drop by #datamapper at freenode some time this weekend (except for evenings)? I’d like to give you some feedback then.

by Michael Klishin (antares)

solnic commented 13 years ago

@antares Decided a different route, I’ll try to ping you at #datamapper this weekend. http://github.com/troyk/dm-core/commit/345e204d1acfe2b1f52843d0a2e7305ab434a105 is where I’m at so far. It is not a perfect fit, but it scratches my itch nicely for getting dm to work with any realistic postgres or oracle backend. It allows:

schema (copied from a rails migration -- sorry):

class CreateContacts < ActiveRecord::Migration def self.up create_table :contacts, :options=>’WITH (fillfactor=70)’ do |t| t.column :account_id, ’int not null references accounts(id) on delete cascade’ t.column :type, "varchar(30) not null" t.column :created_at, "timestamp with time zone not null" t.column :updated_at, "timestamp with time zone not null" t.column :company_id, ’int’ t.column :people_count, ’int not null default 0’ t.column :first_name, "varchar(60) not null default ’’" t.column :last_name, "varchar(120) not null default ’’" t.column :title, ’varchar(80)’ t.column :avatar_url, ’varchar(180)’ t.column :tags, ’text[]’ t.column :references, ’text[]’ t.column :acl, ’int[] not null’ end execute ’create index contacts_name_ordered_idx on contacts (lower(trim(last_name || first_name)))’ execute ’create index contacts_acl_gin_idx on contacts using gin (acl)’ end

def self.down drop_table :contacts end end

dm syntatical love:

Contact.all(:order=>[:last_name.func(’(lower(trim(%s || first_name)))’)]) statement: SELECT "id", "account_id", "first_name", "last_name" FROM "contacts" ORDER BY (lower(trim("last_name" || first_name)))

Contact.all(:order=>[:last_name.desc.func(’(lower(trim(%s || first_name)))’)]) statement: SELECT "id", "account_id", "first_name", "last_name" FROM "contacts" ORDER BY (lower(trim("last_name" || first_name))) DESC

Contact.all(:first_name.like.lower=>’tr%’,:order=>[:last_name.func(’(lower(trim(%s || first_name)))’)]) statement: SELECT "id", "account_id", "first_name", "last_name" FROM "contacts" WHERE (lower("first_name") LIKE lower(’tr%’)) ORDER BY (lower(trim("last_name" || first_name)))

by Troy K

solnic commented 13 years ago

Looks good to me initially, need to check up how much it complicates query implementation

by Michael Klishin (antares)

solnic commented 13 years ago

I recently started dm-types spec suite rework. I will get to this ticket (and dm-aggregates spec suite rework) once that is done, and dm-validations gets autovalidation hooks for types, because every time i see a type author uses aliasing to hook into validation inferring code, I imagine 10 plugins doing the same thing, and I die a little inside.

So you have half a week or so before I really can have an in-depth look at this patch. Thank you anyway :)

by Michael Klishin (antares)

solnic commented 13 years ago

It’s really more of a proof of concept at this point, I’d be interested in hearing other ideas as they come to you. I’m just evaluating using dm over AR and this was the first time I’ve looked at the dm code in awhile. In reality functions are going to be adaptor specific, so maybe the specific adaptor should override the statement generators and just call super if the operator is not recognized.

by Troy K

solnic commented 13 years ago

yea, it would be great for this to be built.

by Phil Wang

solnic commented 13 years ago

I’m going to change this to a suggestion for now. No progress has been made on it in several months.

There are too many things that would need to change internally to allow this to work with anything more than an RDBMS system, and adding RDBMS-only behaviour to the core system is not an option.

However, I do have some ideas in mind that would allow this to happen after DM 1.0 is released; but it will require a complete rewrite of the Query/Adapter layer (for other reasons as well).

by Dan Kubb (dkubb)