hyperloop-rails / powerstation

A Tool for Detecting Performance Bugs in Rails Applications
https://hyperloop-rails.github.io/
BSD 2-Clause "Simplified" License
57 stars 5 forks source link

Still relevant to check `any?` & `where.first` ? #2

Closed itsderek23 closed 6 years ago

itsderek23 commented 6 years ago

I believe more current versions of ActiveRecord address this. Notice that AR is using LIMIT 1 and isn't loading the result set into memory.

irb(main):007:0> User.where(email: 'email').exists?
  User Exists (1.0ms)  SELECT  1 AS one FROM `users` WHERE `users`.`email` = 'email' LIMIT 1
=> false
irb(main):008:0> User.where(email: 'email').any?
  User Exists (0.4ms)  SELECT  1 AS one FROM `users` WHERE `users`.`email` = 'email' LIMIT 1
=> false
irb(main):009:0> User.find_by(email: 'email')
  User Load (0.4ms)  SELECT  `users`.* FROM `users` WHERE `users`.`email` = 'email' LIMIT 1
=> nil
irb(main):010:0> User.where(email: 'email').first
  User Load (0.5ms)  SELECT  `users`.* FROM `users` WHERE `users`.`email` = 'email' ORDER BY `users`.`id` ASC LIMIT 1
=> nil

Credit to @mccallumjack.

graingert commented 6 years ago

@mccallumjack It even documents using .any? and .many? as alternatives http://guides.rubyonrails.org/active_record_querying.html#existence-of-objects

Here's the new code: https://github.com/rails/rails/blob/5a1ea09062eaed78e21253a128d433a1beb745ad/activerecord/lib/active_record/relation.rb#L246-L262

graingert commented 6 years ago

The fix was added 3 years ago! https://github.com/rails/rails/commit/b644964b2b555798fc4b94d384b98438db863b3f

hyperloop-rails commented 6 years ago

Yes, these make sense.

Alghough any? under the current ActiveRecord doesn't issue the select count query, there occur some same performance issues as developers may use xx.count > 0 to check the existence, which will cause the inefficiency.

As a issue in lobters illustrates this situation: https://github.com/lobsters/lobsters/pull/398

graingert commented 6 years ago

Sorry I don't understand what you're saying

hyperloop-rails commented 6 years ago

I understand your concern is that any? in current ActiveRecord API does the same thing as exists?, the thing is that there is some cases that users will directly use xx.count > 0 to achieve the same functionality as exists?, this is for sure inefficient.

graingert commented 6 years ago

Ok great, not sure how that's relevent though

graingert commented 6 years ago

Obviously there should be exactly one way of solving every problem with any library. Clearly the docs should not say:

You can also use any? and many? to check for existence on a model or relation.

Buy instead say:

any? and many? used to issue a very inneficent query, however these methods have recently been optimised so that when called with no block they delegate to exists?. For consistency and backwards compatibility you should always use exists?

akcheung commented 6 years ago

I can imagine cases where this can be the result of abstraction, say where(..) is wrapped into an API function foo and the caller, not knowingly what's underneath the covers, chains first after the call to foo, resulting in the inefficiency.

graingert commented 6 years ago

Sorry I don't understand again

On Thu, 28 Jun 2018, 23:49 Alvin Cheung, notifications@github.com wrote:

I can imagine cases where this can be the result of abstraction, say where(..) is wrapped into an API function foo and the caller, not knowingly what's underneath the covers, chains first after the call to foo, resulting in the inefficiency.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hyperloop-rails/static-checker/issues/2#issuecomment-401196214, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZQTLCOYmF25bTjykL4w3B4UHQKAPuuks5uBV2HgaJpZM4U7fum .

congy commented 6 years ago

@graingert yes you are right currently exists? and any? are not different. Still in the example first? is slower since it is doing extra ordering. Some applications we checked were developed a while ago and require an older version of rails, and that's why we found this inefficiency. We will add static-checkers for different versions of rails later.

congy commented 6 years ago

@itsderek23 thanks for pointing it out. But it is still important to check whether people misuse first? or find_by where they can use any?/exists? :)

graingert commented 6 years ago

I see what you're saying now and totally agree. I also think it's worth having a warning for .any? and .many? because there should be exactly one way per thing of doing things

itsderek23 commented 6 years ago

Thanks @hyperloop-rails - so my understanding with Rails / ActiveRecord 5.0:

No reason to check any? vs. exists?

It looks like this has been removed: https://github.com/hyperloop-rails/powerstation/blob/master/static-checker/string-matching.sh

Keep the find_by vs. where().first check

where().first adds sorting by primary key.

I'll close this - thx for clarifying!