CreateList is not a safe cop

ngouy commented 4 years ago

Description

rubocop autocorrect this :

10.times { create(:user, age: rand(10..0)) }

to

create_list(:user, 10, age: rand(10..0))

both obviously do not have the same output as in the corrected version rand is only evaluated once and will attribute its result to all users.

We can "prevent" this behaviour by using the n like 10.times { |n| ...}but this is just a corner case of multiple possibilities wherex.times` is just a legit call

Expected behavior

~this should not be a safe cop~

pirj commented 4 years ago

I would disagree that the only possible measure is to mark the cop as unsafe. Autocorrection for this cop is quite useful and works quite well in most cases. Keeping in mind that RuboCop runs only safe auto-correction by default, that makes its auto-correction feature merely reachable for everyday usage.

If a cop or its auto-correct is annotated as "not safe", it will be omitted when using --auto-correct

What do you think of adding detection of the cop and skip auto-correction of offences with method calls in arguments?

This is technically possible:

create_list(:user, 10, age: rand)

parses to:

(send nil :create_list
  (sym :user)
  (int 10)
  (hash
    (pair
      (sym :age)
      (send nil :rand))))

rand = 1
create_list(:user, 10, age: rand)

parses to:

(begin
  (lvasgn :rand
    (int 1))
  (send nil :create_list
    (sym :user)
    (int 10)
    (hash
      (pair
        (sym :age)
        (lvar :rand)))))

Notice the difference in the last line of ASTs. I've removed the argument to rand so that it's not obvious just by looking at it if it's a local variable or a method call. Ruby parser does a good job figuring this out.

Also, _list methods can be used with the block syntax, i.e.:

build_list(:user, 10) { |user| user.age = rand }

But careful, this shouldn't be used with create_list.

ngouy commented 4 years ago

Yes "sadly" it just doesn't work with the create_list, and/but that's the point.

Thank you for pointing out the AST differences. I am not yet very familiar with those.

will this technique be able to differentiate something like

rubocop/rubocop-rspec#1 
user = User.new(age: 1) # age is a persisted argument, and ORM expose it through/as a method
# bad
10.times { create(:user, age: user.age) }

rubocop/rubocop-rspec#2
user = User.new # age is an (weird) instance method that return a random number every time it's called
# good
10.times { create(:user, age: user.age) }

If not, should we then detect both as "valid" cases of the usage of "times" vs "create_list"? It would defeat the purpose anyway of trying to detect what cases are valid or not?

Again apologies for those questions if they may look stupid, as I am not familiar with AST, maybe that's the answer to it all ;)

pirj commented 4 years ago

That's right, user.age is a method call, and will be detected as such. With static code analysis, we never know for sure if the returned value is constant or not. So my suggestion is to skip auto-correction when a method call is amongst arguments.

ngouy commented 4 years ago

So if I get this right there are 2 choices :

keep it safe, but never correct when a method is used in arguments (even if they may just be attr_accessor, which is (I guess it depends on your codebase) a pretty common use case)
make it unsafe, cons are that it never correct for bad uses even when we know for sure they are bad, but can still be run unsafely on any current detections, where most of them are indeed correct "bad" detection

I agree the first one is more tempting, but I feel like it will prevent correcting lots of real "bad" example (but not anymore detected as a bad case, for example, every time you are using an attr_accesor) than it may correct bad and detected positives

ngouy commented 4 years ago

Do detection and auto-correct are bound? Maybe we can still mark every one "bad" usage (like today, even with false detections), but only auto-correct the ones we know for sure they are (the one without method calls)?

pirj commented 4 years ago

We capture arguments:

          def_node_matcher :factory_call, <<-PATTERN
            (send nil? :create (sym $_) $...)

but don't seem to use them. I suggest adding a similar node matcher (notice I've replaced a long construct with nil? for simplicity),

          def_node_matcher :alarma_method_call_in_arguments?, <<-PATTERN
            (send nil? :create sym ... (hash <(pair _ send)>)

This basically reads as "return true if the input is a create call with a symbol, and ending with a hash where one of the values is a method call". And throw in a call to this method somewhere here:

- CreateListCorrector.new(node.send_node).call(corrector)
+ CreateListCorrector.new(node.send_node).call(corrector) unless alarma_method_call_in_arguments?(node.send_node)

Do detection and auto-correct are bound?

Not really, you can add an offence and then skip the correction just by not using the corrector.

This should do the job. Add some specs to cover some basic cases with expect_offense and expect_no_correction. I'll be happy to provide further assistance to iron it out, and the fix is good to go.

This cop really needs some love.

ngouy commented 4 years ago

What I don't quite understand is that if we keep mark as "bad" pieces were alarma_methods are detected (even if we don't autocorrect them)

pirj commented 4 years ago

Fair enough! Agree that adding return if alarma? and skipping the offence just as well makes sense.

ngouy commented 4 years ago

That's what is complicated with this issue, and sorry to challenge it so much.

By doing I am afraid that we will hide a lot of false positive offences. They will not appear as offences but will be. If I am just opening this issue now, it's that 99% of the time this cops do this job, and I am afraid we will misguide a lot of people if we "silence" it each time a method is called

But on the other side it doesn't make much sense to, having a "safe" and auto-correctable (also safe) cop, to mark offences without correcting them.

I think we have a real issue on this one

ngouy commented 3 years ago

@pirj any updates? It seems really difficult to find a compromise here, yet there is a real issue

pirj commented 3 years ago

I still stand by

Even though this cop is correction-unsafe, I'm more inclined to fixing this case rather than marking its otherwise useful auto-correction as unsafe.

Let's skip auto-correction when there are method calls in arguments.

ngouy commented 3 years ago

@pirj all good for me

Let me recapitulate:

we continue to detect offences as before (aka the actual behaviour that detects all true offences but also some false positive). If so, we have to pass the cop as unsafe (not talking about auto-correction safe here)
we only autocorrect for cases where no methods are called (when we are sure 100% they are true positive offences)
thus we can keep it as an autocorrect safe cop

pirj commented 3 years ago

Sorry, it slipped my mind that we've discussed above that we shouldn't detect the case with a method call in arguments as an offence.

So:

skip detection of FactoryBot create_list method calls in arguments
keep the cop safe and autocorrect-safe

Does that sound good to you?

ngouy commented 3 years ago

What bothers me with this, is that we will kick our a lot (if not most) of true positive offences.

Indeed it is really common to do:

create(:user) do |user|
  create_list(:book, 3 author_id: user.id) # I know we can use `author: user` but its to expose my point
  # or the bad one, that now will not be an offense anymore
  3.times { create(:book, author_id: user.id) }
end

Your solution seems easier to apply. I'm ok to go with that I just want to point out that this will probably kill a lot "it happens really often in real-life use cases" true positive detections

pirj commented 3 years ago

As I've mentioned somewhere above, this cop is far from perfection. We'll keep it safe, and this is what matters. I can surely see a different FB cop that would advise using author: user.

Personally, especially after https://github.com/thoughtbot/factory_bot/issues/1444 pitfall, I strongly prefer n.times syntax.

ngouy commented 3 years ago

I will probably send try to send a PR by the end of the week

pirj commented 3 years ago

Sounds great. Please don't hesitate to send the PR early and asking questions, writing cops is tricky at first.

ngouy commented 3 years ago

@pirj PR in draft √

Darhazer commented 3 years ago

@ngouy @pirj what if it auto-corrects non-literal nodes to block syntax like 10.times { create(:user, age: rand(10..0)) } becomes

create_list(:user, 10) do |user|
  user.age = rand(10..0)
end

or

create_list(:user, 10) do |user|
  user.update(age: rand(10..0))
end

I hacked also an improved auto-correct for this but considering what @pirj said about those attributes not being persisted or the need for extra DB call to persist them, I'm not sure if that's the right solution.

Also, do we need a cop to warn about attributes set in the block of a create/create_list ?

pirj commented 3 years ago

To be on the safe side from my point of view we should ignore iterators that call create with non-literal arguments. You are absolutely correct about the implications.

do we need a cop to warn about attributes set in the block of a create/create_list ?

Yes!

amomchilov commented 9 months ago

Hey guys! I had a non-safe-autocorrection happen:

- 4.times { create(:product_review, created_at: Time.now.utc + rand(100).days) }
+ create_list(:product_review, 3, created_at: Time.now.utc + rand(100).days)

From my reading of the thread, this rand(100) should have been picked up as a method call, which should disqualify the offence for auto-correction. Is this a bug, or did I understand incorrectly?

Cheers.

pirj commented 9 months ago

It’s indeed a bug, @amomchilov , thanks for reporting.

pirj commented 9 months ago

https://github.com/rubocop/rubocop-factory_bot/issues/100

ydah commented 9 months ago

Hey guys! I had a non-safe-autocorrection happen:
- 4.times { create(:product_review, created_at: Time.now.utc + rand(100).days) }
+ create_list(:product_review, 3, created_at: Time.now.utc + rand(100).days)
From my reading of the thread, this rand(100) should have been picked up as a method call, which should disqualify the offence for auto-correction. Is this a bug, or did I understand incorrectly?

Cheers.

@amomchilov I didn't reproduce it. please let me know the result of rubocop -V.

rubocop / rubocop-rspec

CreateList is not a safe cop #1087

Description

Expected behavior