mbj / mutant

Automated code reviews via mutation testing - semantic code coverage.
Other
1.96k stars 153 forks source link

Question: RSpec selection overlap #374

Open tjchambers opened 9 years ago

tjchambers commented 9 years ago

Just when I thought I was getting to understand this a new issue popped up:

I am issuing the following:

RAILS_ENV=test bundle exec mutant -r ./config/environment.rb -j2 --use rspec -I app/models/perform/activity.rb Activity > ./log/mutant/Activity.log

The header from mutant looks like:

Matcher:         #<Mutant::Matcher::Config match_expressions=[<Mutant::Expression: Activity>] subject_ignores=[] subject_selects=[]>
Integration:     rspec
Expect Coverage: 100.00%
Jobs:            2
Includes:        ["app/models/perform"]
Requires:        ["./config/environment.rb", "activity.rb"]

I thought my selection of examples would come from activity_rspec.rb, but I see the following in my logs:

- rspec:2:./spec/features/align/activity_log_spec.rb:9/Activity Log Admin views Assessment Activity log
- rspec:167:./spec/features/perform/activity_log_spec.rb:9/Activity Log Admin views Scorecard Activity log
- rspec:168:./spec/features/perform/activity_log_spec.rb:15/Activity Log Admin views RoadMap Activity log
- rspec:169:./spec/features/perform/activity_management_spec.rb:9/Activity Management User views an Activity
- rspec:170:./spec/features/perform/activity_management_spec.rb:15/Activity Management User views all Activities for a scorecards
- rspec:171:./spec/features/perform/activity_management_spec.rb:21/Activity Management User edits an Activity
- rspec:172:./spec/features/perform/activity_management_spec.rb:28/Activity Management User removes an Activity
- rspec:173:./spec/features/perform/activity_management_spec.rb:35/Activity Management User adds an Activity
- rspec:174:./spec/features/perform/activity_statuses_spec.rb:9/Activity statuses Admin lists Activity Status
- rspec:175:./spec/features/perform/activity_statuses_spec.rb:15/Activity statuses Admin creates new Activity Status
- rspec:211:./spec/features/solve/activity_log_spec.rb:9/Activity Log Admin views Activity log
- rspec:2631:./spec/models/perform/activity_spec.rb:7/Activity inactivates the swimlane
- rspec:2632:./spec/models/perform/activity_spec.rb:17/Activity inactivates the phase
- rspec:2633:./spec/models/perform/activity_spec.rb:27/Activity automatically assigns the default activity owner
- rspec:2634:./spec/models/perform/activity_spec.rb:40/Activity handles missing default objective owner
- rspec:2635:./spec/models/perform/activity_spec.rb:46/Activity is not allowed on an AOP
- rspec:2636:./spec/models/perform/activity_spec.rb:51/Activity provides title as to_s
- rspec:2637:./spec/shared_model_stuff.rb:3/Activity behaves like real_model .new has a valid factory
- rspec:2638:./spec/shared_model_stuff.rb:9/Activity behaves like real_model #destroy can be destroyed

Because activity is a word used in other contexts, it appears the filter is every spec that has the word activity somewhere in the name including ActivityLog, ActivityStatus, 'Activity Management'. I know there must be a way for me to constrain this, but I am not sure how to specify. I am inferring from the output that it is looking for specs which start with 'activity' because it is NOT including a spec for SupportingActivity. I cannot tell whether it is looking at just the file name, or the example name, or a describe block.

I tried looking at the source and was unsuccessful in understanding what it was looking at to filter.

I could use some guidance.

tjchambers commented 9 years ago

The more I think about this issue, the more I feel as though my performance I experience with mutant is a direct reflection of MY fundamental lack of understanding of this expression matching subject. My inference is that the numeric value to the right of the word 'rspec" is an indication somehow of the number of examples being considered for a particular mutation.

@mbj you remarked in a twitter conversation if I recall a while back that you had mutated ~1000 LOC in 17 seconds (I am paraphrasing). You know I am doing my mutations on activerecord models in Rails without namespacing of the models - NO namespacing for the models at all.

My inference (or rather my gut) is telling me that the performance could be a lot better. Even when I have a model where the methods do simple arithmetic or provide a symbol or a number (i.e. no db access) I can't seem to get any better than one mutation per second. Most of my models have at least 100 mutations so I can't get them done in under a minute. Paralleling the jobs helps, but I am concerned that too many potential examples are being considered, because I am not specifying on the command line something that could easily narrow the field.

I also infer that namespacing would play a bigger role. When I look at the mutant source code there is a high degree of namespacing - very appropriately in my opinion. That implicitly would seem to play a role in narrowing the examples considered.

I am declaring that my gut tells me I am missing something that would greatly speed up my mutation runs, even as I look at the alive mutations and see for the most part (excluding the above) that the spec examples I see listed above an alive mutation appear to be those I would have anticipated.

@mbj you said in a number of tweets, and my experience concurs, that proper test selection is critical. I just have this feeling like I am not narrowing the candidates test suite enough, but without profiling my mutant run to see where the time is being spent, my feeling is it is not mutant, nor rspec that are slow, but rather I am not being specific enough in my command.

I'll stop there and await your questions or thoughts.

mbj commented 9 years ago

@tjchambers I do not have the time to write a verbose message on this issue. There is a lot to include to write a full documentation. Before I do a misleading / incomplete comment I prefer to do none.

Also its unlikely I'll find the time in the next 14days sorry.

tjchambers commented 9 years ago

No attempt to hurry you - take your time. I feel like this is my failing so I wanted to declare that and include the context so I am not excluding an important detail.

tjchambers commented 9 years ago

Please do not read till you are back from holiday!

I did a rough non-scientific test yesterday of my hypothesis that the number of specs defined in my app correlate to individual mutation command performance. I will say up front that there are lots of moving variables, so I tried to change one thing only - the number of specs defined in my app.

Here is what I did:

RAILS_ENV=test bundle exec mutant -j1 -r./config/environment.rb -Iapp/models/misc/tasks -rresults_invites_task.rb --use rspec ResultsInvitesTask > log/mutant/ResultsInvitesTask.log

results:

Mutant configuration:
Matcher:         #<Mutant::Matcher::Config match_expressions=[<Mutant::Expression: ResultsInvitesTask>] subject_ignores=[] subject_selects=[]>
Integration:     rspec
Expect Coverage: 100.00%
Jobs:            1
Includes:        ["app/models/misc/tasks"]
Requires:        ["./config/environment.rb", "results_invites_task.rb"]
Subjects:        6
Mutations:       99
Kills:           64
Alive:           35
Runtime:         1620.70s
Killtime:        1400.61s
Overhead:        15.71%
Coverage:        64.65%
Expected:        100.00%

Then I cloned the Rails app and removed about 80% of my spec files. I then ran this again and the total run time was 625 seconds. That is about 40% of the original runtime. I ran it again and the total runtime was 587 seconds. Non-scientifically it seemed like the startup time - time to first mutation cycle output after initialization was approximately the same. My inference this is spec loading and Rails class loading. Note I only removed spec files, all lib, model, etc. files in my app were left constant.

Given that this seemed to speed things up I removed ALL spec files except helper and support files and the spec for the class being mutated. Results:

Mutant configuration:
Matcher:         #<Mutant::Matcher::Config match_expressions=[<Mutant::Expression: ResultsInvitesTask>] subject_ignores=[] subject_selects=[]>
Integration:     rspec
Expect Coverage: 100.00%
Jobs:            1
Includes:        ["app/models/misc/tasks"]
Requires:        ["./config/environment.rb", "results_invites_task.rb"]
Subjects:        6
Mutations:       99
Kills:           64
Alive:           35
Runtime:         351.15s
Killtime:        243.69s
Overhead:        44.09%
Coverage:        64.65%
Expected:        100.00%

Again I observed a measurable and significant reduction in runtime by reducing the number of specs leading to a 75% reduction in mutant runtime by simply removing non-involved specs.

My inference from the above is that the number of specs I have (and therefore number of examples) - specs that over 99% do not cover the module under test - play an active role in the mutation/run spec cycle - not just the startup cycle. The startup times seem in the same ballpark - approx 20 seconds before first mutation cycle is logged. However the individual mutation cycles themselves click off faster when I have fewer spec files. I tried to minimize the number of jobs so that overlap, etc did not influence the results.

There are a number of other considerations I cannot ignore. More spec files interrogated to determine the appropriate universe of examples for a given mutation run also mean loading them (I expect once except for those actually being used to test the mutations). Loading them may involve class loading and constant lookup, etc. So removing specs removes any downstream effects of loading and interpreting those specs.

I am going to review this further and see what I can interpret from mutant-rspec. I did a cursory look at it yesterday and it looked as expected - the universe of specs is examined AT LEAST ONCE. I would expect that. I am going to try and understand whether this occurs only once, or more than once, or because there is some parallelization that requires this once per process, etc.

I respect that you may already have good explanations for all of this, and see my efforts as a waste of time. It may even aggravate you that I am doing this while you are at leisure. Please resist that thought. I consider this instructive and educational for a tool and an infrastructure (RSpec) that I need to know extremely well. This is fruitful even if it just lets me read and appreciate excellent examples of code with a useful purpose. It deepens my tool knowledge and knowing your tools well is valuable in itself.

Highly unscientific but compelling given the opportunity for a potential 4x improvement in mutation speed.

If this wasn't Open Source I would not have this wonderful opportunity. I do appreciate it and all the work you and colleagues have done to get this where mutant is today. At some point perhaps I can make a useful contribution.

tjchambers commented 9 years ago

My inference from further experimentation in this area is that there are a number of MY contextual settings that conspire to degrade the performance of mutation testing in my situation. They are:

Based on the above construction, my theory is that when a worker filters the specs, by merely interpreting the spec file it is invoking the class loading overhead and object creation associated with that, and so each worker has an enormous amount of non-value added overhead for each mutation.

Based on the above theory, my goal was to limit the specs which are presented to a mutation run for a class or module. As you know, @mbj, I target all of the unit tests for a class in a corresponding companion spec file, so looking through the universe of my spec files is non-value added. I KNOW which spec file has the examples that will (should?) kill the mutations.

So I added a pre-step to my RAKE processing to create a customized .rspec file with the content

--pattern '<spec_file>'

just prior to mutating a specific class. This has netted a 2x to 10x speed improvement in my mutation process runs.

Based on the above kludge, and my chosen work process, it would be handy to be able to specify this --pattern option on the mutant command line to be passed as an option if provided to rspec during startup (in addition to the "spec --fail-fast"). If I then chose to do so I would achieve one of my tenets which is to narrowly target the specs by file and/or example naming to focus specs onto the associated mutated methods.

mbj commented 9 years ago

@tjchambers This one is easy to address, but I do not have the time to write it down yet. I'll get to this one soon. Sorry but I simply do not have the required amounts of OSS time to finally write down everything you need.

tjchambers commented 9 years ago

@mbj No problem - my workaround works and is saving me loads of time. I will add that unless I am missing something I could do already, I consider this low priority as it only seems to affect my situation.