kohana / orm

Kohana ORM
159 stars 108 forks source link

yet another count_all() bug #128

Open djfd opened 8 years ago

djfd commented 8 years ago

Hi,

having the situation of selecting users having 'admin' role and any of some predefined roles set one can use, for example, next sequence

$admins = ORM::factory('Role')->where('name', '=', 'admin')->find()->customers;
// $rids is an array of predefined roles IDs
$admins
    ->distinct(TRUE)
    ->join('customer_role', 'roles_list')->on('roles_list.customer_id', '=', 'user.id')
    ->where('roles_list.role_id', 'in', $rids);

this produces absolutely perfectly valid SQL:

SELECT DISTINCT `user`.*
FROM `customer` AS `user`
JOIN `customer_role` ON (`customer_role`.`customer_id` = `user`.`id`)
JOIN `customer_role` AS `roles_list` ON (`roles_list`.`customer_id` = `user`.`id`)
WHERE `customer_role`.`role_id` = '2' AND `roles_list`.`role_id` IN (2, 9, 1, 8)

However, if we try to count such admins using $admins->count_all() this is rendered to wrong sQL like this:

SELECT DISTINCT COUNT(`user`.`id`) AS `records_found`
FROM `customer` AS `user`
JOIN `customer_role` ON (`customer_role`.`customer_id` = `user`.`id`)
JOIN `customer_role` AS `roles_list` ON (`roles_list`.`customer_id` = `user`.`id`)
WHERE `customer_role`.`role_id` = '2' AND `roles_list`.`role_id` IN (2, 9, 1, 8)

Which in turn gives wrong count. It easy to see that we need to have

SELECT COUNT(DISTINCTuser.id) instead of SELECT DISTINCT COUNT(user.id)

see https://github.com/kohana/orm/blob/3.3/master/classes/Kohana/ORM.php#L1621 particularly starting from line 1644 (https://github.com/kohana/orm/blob/3.3/master/classes/Kohana/ORM.php#L1644)

Thanks

acoulton commented 8 years ago

@djfd thanks for reporting this. Unfortunately I don't think any of the main Kohana contributors are still using the ORM library - there are various issues with it and I think everyone would recommend using something else (Doctrine 2, for example) if you have the choice.

If you are locked in, then we'd appreciate a pull request to fix this.

From a quick look it looks like the actual DISTINCT is prepended within the kohana/database query builder - the simplest fix may be to check for and temporarily remove a pending distinct from db_pending (the same way selects are removed at the top of ::count_all()) and then add it to select yourself in Line 1647

I've not tested that - if you can have a go and see if it resolves your issue without creating others then please do send it as a pull request for 3.3/develop. Otherwise someone will try to get round to this when they can, but it may take a while. There are not many of us, so we're having to prioritising the little time we have.

djfd commented 8 years ago

@acoulton

yeah, your approach looks correct, and it will work fine: remove distinct from pending, add it after count in line 1647, then restore.

unfortunately I am not familiar with pull requesting routine so I just cannot do that.

acoulton commented 8 years ago

@djfd then now's a good time to learn :grin: - it's really very easy. There's a decent beginners guide at https://akrabat.com/the-beginners-guide-to-contributing-to-a-github-project/ - the only difference is we currently use the 3.3/develop branch instead of master so once you have a local copy run:

$ git checkout 3.3/develop
$ git pull upstream 3.3/develop && git push origin 3.3/develop
$ git checkout -b 3.3/bug/count-all-distinct

and when you create your pull request set 3.3/develop as the target branch.

If that's too much, then since your changes are just in one file you could make your changes and test them locally, then use the edit (pencil) icon on https://github.com/kohana/orm/edit/3.3/develop/classes/Kohana/ORM.php to manually transfer them and create a pull request. If you do that be very careful you copy them across correctly and review the comparison to be sure you've not made any unwanted changes.

djfd commented 8 years ago

@acoulton you are tech writing genius, really no kidding. Your three lines of commands is that I was looking for, you saved my life, thanks !

As for the the issue itself, it seems more correct to improve query builder to have that count_all() operation: 1) it knows/uses underlying DB driver thus knows how and where to insert 'DISTINCT' keyword, while ORM level class/model should not use even 'COUNT()' function (can we be sure that RDBMS uses count() as its [best/fastest/at all] counting function?) 2) it works with a model tightly so certainly knows that for a counting we need use primary key and do not use its usual columns thus capable to make these changes on its own transparently for a model, do calculation, then return the things back

What do you think?