Criteria/Classification for Mutations?

tjchambers commented 9 years ago

I was reading through the source code in an effort to better understand the classes of mutations, and I got to wondering what the philosophical criteria were for choosing/implementing mutations. Basically, the question I ended up with is: what makes a suitable mutation?

In my experience with mutant I synthesized some categories (incomplete):

Promotion to greater strength (== -> eql? or equal?)
Omission (remove an element in a parameter list or array and deduce if it is critically tested)
Boundary checking (>= -> == or >)
Ordering (reverse_map -> map)
Numeric negation (1 -> -1)
Predicate replacement (predicate -> false, self, nil)
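To make one of these categories concrete, here is a minimal sketch of the boundary-checking case. The methods `adult?` and `adult_mutated?` are hypothetical, and the mutated version is written by hand rather than produced by mutant.

```ruby
# Original method under test (hypothetical example).
def adult?(age)
  age >= 18
end

# Hand-written stand-in for a boundary mutation (>= replaced by >).
def adult_mutated?(age)
  age > 18
end

# Inputs far from the boundary cannot tell the two versions apart,
# so a spec using only these would leave the mutant alive.
weak_inputs = [5, 40]
weak_kills = weak_inputs.any? { |age| adult?(age) != adult_mutated?(age) }

# An input exactly on the boundary distinguishes them and kills the mutant.
strong_kills = adult?(18) != adult_mutated?(18)

puts "weak spec kills mutant:   #{weak_kills}"
puts "strong spec kills mutant: #{strong_kills}"
```

The point of the sketch is only that a boundary mutation asks for a spec example at the boundary value itself.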

I am certain this is an extremely rough and surely incomplete taxonomy.

One area where I fail to see mutation types (at least so far) is what I would call opposition. Not being schooled in mutation testing per se, I don't see many replacements of one method with its diametric opposite. For example (and vice versa):

#> to #< or #<=

#select to #reject

#odd? to #even?

My inference is that most of these would perhaps NOT catch bugs (hopefully), and some could exacerbate infinite looping.
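A small sketch of why an "opposition" mutation would rarely survive: the opposite method disagrees with the original on almost every input, so nearly any spec with an assertion kills it. The methods `odds` and `odds_opposite` are hypothetical and the mutant is hand-written.

```ruby
# Original method (hypothetical example).
def odds(list)
  list.select(&:odd?)
end

# Hand-written "opposition" mutant: select replaced by reject.
def odds_opposite(list)
  list.reject(&:odd?)
end

inputs = [[1, 2, 3], [2, 4], [7], []]

# Collect the inputs on which original and mutant disagree.
differing = inputs.select { |list| odds(list) != odds_opposite(list) }

puts "inputs that kill the mutant: #{differing.inspect}"
```

In this sketch only the empty list fails to distinguish the two versions.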

Thoughts?

dkubb commented 9 years ago

Could another term for this be an "inverse function"?

mbj commented 9 years ago

I call them semantically reducing operators and orthogonal operators.

#[] to #fetch is an instance of "reduction operator". #> to #< is an orthogonal one.
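As one reading of that distinction, the sketch below contrasts the two examples using only core Ruby behavior. The `config` hash and its keys are made up for illustration.

```ruby
config = { retries: 3 }

# Original code: Hash#[] returns nil for a missing key.
lenient = config[:timeout]

# Reduction (#[] -> #fetch): Hash#fetch accepts fewer inputs and
# raises KeyError for a missing key.
strict = begin
  config.fetch(:timeout)
rescue KeyError
  :key_error
end

# For a present key both calls agree, so a spec that never relies on
# missing-key behavior cannot kill the #fetch mutant.
same_for_present = config[:retries] == config.fetch(:retries)

# Orthogonal (#> -> #<): the mutant is not a stricter version of the
# original, it simply behaves differently.
greater = 5 > 3
less = 5 < 3

puts "missing key via #[]:    #{lenient.inspect}"
puts "missing key via #fetch: #{strict.inspect}"
puts "present key agrees:     #{same_for_present}"
puts "5 > 3: #{greater}, 5 < 3: #{less}"
```

If the `#fetch` mutant stays alive, that suggests the stricter call could replace the original without any spec noticing.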

mbj commented 9 years ago

At the source code level no distinction is made, as I do not see a reason to make one. It could be included in the report, but I doubt it would help anyone.

mbj commented 9 years ago

The classes @tjchambers made up are more fine grained, but I think they all fit into reduction vs orthogonal.

tjchambers commented 9 years ago

@dkubb Sounds clearer to me than my term.

@mbj Orthogonal is good too.

I guess my question is really - is there a reason to NOT perform these orthogonal mutations?

mbj commented 9 years ago

@tjchambers Orthogonal operators are where most equivalent mutations come from. But they are very helpful, for example when turning literals into others, e.g. true to false.

dkubb commented 9 years ago

I think a clear set of terms to describe the different classifications would benefit mutation testing since all authors have a common vocabulary in which to share ideas. Right now there is some agreement, but it's never been formalized.

mbj commented 9 years ago

> I think a clear set of terms to describe the different classifications would benefit mutation testing since all authors have a common vocabulary in which to share ideas. Right now there is some agreement, but it's never been formalized.

I agree with that. But we should have at least a two-level specification: first reduction vs orthogonal, then a second level like the list @tjchambers posted initially.

tjchambers commented 9 years ago

The ability to classify the mutations helped me envision additional ones. So I felt more engaged in the process of considering what makes a valuable mutation, as opposed to feeling like a bystander using a black box tool. My two cents.

mbj commented 9 years ago

> The ability to classify the mutations helped me envision additional ones. So I felt more engaged in the process of considering what makes a valuable mutation, as opposed to feeling like a bystander using a black box tool.

I think that the mutant mutator DSL could be changed from emit(mutation) to emit_reduction(mutation) and emit_orthogonal(mutation). That would allow attaching the "type" of mutation to the object model, so the type could appear in the report.
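A rough sketch of what such a split might look like. Only the names `emit_reduction` and `emit_orthogonal` come from the comment above; `Mutation`, `SketchMutator` and the report format are hypothetical and are not mutant's actual internals.

```ruby
# Hypothetical value object: mutated source plus its classification.
Mutation = Struct.new(:source, :kind)

# Hypothetical mutator; not mutant's real mutator class.
class SketchMutator
  attr_reader :mutations

  def initialize
    @mutations = []
  end

  # Emit a mutation that narrows the semantics of the original code.
  def emit_reduction(source)
    emit(source, :reduction)
  end

  # Emit a mutation that swaps in unrelated semantics.
  def emit_orthogonal(source)
    emit(source, :orthogonal)
  end

  private

  # Record the mutation together with its classification.
  def emit(source, kind)
    @mutations << Mutation.new(source, kind)
  end
end

mutator = SketchMutator.new
mutator.emit_reduction('hash.fetch(key)') # mutated from hash[key]
mutator.emit_orthogonal('a < b')          # mutated from a > b

# With the kind on the object model, a report can print it.
report = mutator.mutations.map { |m| "[#{m.kind}] #{m.source}" }
puts report
```

Keeping the classification on the mutation object, rather than in the reporter, is what would let other consumers such as documentation generators reuse it.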

tjchambers commented 9 years ago

I would add that classifying the mutations allowed me to develop mitigation strategies in my code. It helped me develop a mental cheat sheet of how to code to kill mutants and as a byproduct write more consistent and better code.

dkubb commented 9 years ago

When reporting a mutation failure would it help to say what kind of mutation it was that failed? That would allow people to form the mental links between the classification and the strategy they need to use in the specs to kill that kind of mutation.

mbj commented 9 years ago

> I would add that classifying the mutations allowed me to develop mitigation strategies in my code. It helped me develop a mental cheat sheet of how to code to kill mutants and as a byproduct write more consistent and better code.

The mutant meta DSL should then also be changed to reflect the class of mutation.

The nice thing is that with such a classification mutant now has 2 dimensions to classify mutations:

The classification we talk about here
The AST node type mutant uses right now

This will make it possible to render some nice documentation from the meta. I actually planned this for my commercial addon (product playground) service anyways. But I've to admit I lost interest in that one a bit as my interest in ruby declines.

tjchambers commented 9 years ago

@dkubb Absolutely - having the type in the report would let me recognize it much more quickly than having to back into what it is trying to tell me. Ideally it would eventually be able to (automagically) say "Tim, you forgot to test for X condition" without hesitation or cogitation.

dkubb commented 9 years ago

This will also put some positive pressure on us to form a taxonomy of well-named mutation operators.

I noticed when using tools like reek, when I see a method has a "utility function" smell I know exactly how to handle that case based on lots of experience with the tool.

dkubb commented 9 years ago

I also like how reek has documentation on what each smell looks like along with possible fixes in some situations, eg: https://github.com/troessner/reek/tree/master/docs

tjchambers commented 9 years ago

This is probably OT but one of the time-wasters I find is when I discover a sequence of unkilled mutations that are BIZARRELY and obviously NON-EXECUTABLE which eventually I trace to a code path not covered because of a mutation above. If somehow it would slap me when that occurs it would save me much time.

mbj commented 9 years ago

I support the introduction of such a taxonomy. But I do not see myself doing it. For my personal mutant experience there are more pressing issues to address. And my OSS policy is to make myself happy first, then my clients, then everyone else.

This results in various delays for features in public demand, because I do not have demand for them.

tjchambers commented 9 years ago

@dkubb I like the reek documentation as well - although it is somewhat buried. I do it though. That kind of documentation should be near the code, but it also should be accessible to the user as well.

mbj commented 9 years ago

> This is probably OT but one of the time-wasters I find is when I discover a sequence of unkilled mutations that are BIZARRELY and obviously NON-EXECUTABLE which eventually I trace to a code path not covered because of a mutation above. If somehow it would slap me when that occurs it would save me much time.

Mutant could feed the mutation results back into the engine, stopping the descent into a node when a direct mutation on the parent has already failed. For my personal use I do not need such a feature, because I have probably been trained already to see a "chain of uncovered mutations" as "Ahh, a full unexecuted method, line, branch".
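One way to picture that feedback loop, as a sketch only: walk the node tree and stop descending as soon as a node's own mutation survives. The `Node` struct, the `killed` flag and the sample tree are all hypothetical, and reading "failed" as "the mutation stayed alive" is an assumption.

```ruby
# Hypothetical node: a name, whether a direct mutation of this node
# was killed by the specs, and its child nodes.
Node = Struct.new(:name, :killed, :children)

# Returns the names of the nodes worth reporting. Once a node's own
# mutation survives, its children are skipped, since they are most
# likely unexecuted as well.
def alive_roots(node)
  return [node.name] unless node.killed

  node.children.flat_map { |child| alive_roots(child) }
end

tree = Node.new('method body', true, [
  Node.new('if branch', false, [
    Node.new('inner call', false, []),
    Node.new('inner literal', false, [])
  ]),
  Node.new('else branch', true, [
    Node.new('return value', false, [])
  ])
])

roots = alive_roots(tree)
puts roots
```

Here the two mutations inside the uncovered `if branch` are never reported; only the branch itself and the uncovered return value are.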

> @dkubb I like the reek documentation as well - although it is somewhat buried. I do it though. That kind of documentation should be near the code, but it also should be accessible to the user as well.

That's what I wrote the meta for: to have a place to combine documentation & spec. (The meta is the spec for the mutation engine.)

tjchambers commented 9 years ago

I spoke elsewhere of the friction to mutation testing, and things such as the above (requiring experience to deduce the action to take, and to recognize patterns quickly), are where the personal costs outweigh the perceived benefits early on. Not a criticism of the tool or your efforts, but IMHO a reason for mutation testing to lack traction.

If I had to internalize how to disassemble and reassemble an internal combustion engine to be able to drive a car, few people would ever get their license.

mbj commented 9 years ago

> I spoke elsewhere of the friction to mutation testing, and things such as the above (requiring experience to deduce the action to take, and to recognize patterns quickly), are where the personal costs outweigh the perceived benefits early on.

I agree. But it's unlikely I can change that, unless someone funds me to address the areas "I do not need". I'm not "begging" with that message, just stating it as a fact.

I need to use my limited OSS time in my personal interest, else I'll probably go broke and will not have OSS time at all. I value other people's input, and where I see compound interest I factor their needs in. But adding a feature I'd never use would be a pure waste of my OSS time. Writing good tools is nontrivial; without my prioritization of my own needs the tool would still be toy level and not usable at all. It would even be unusable for people who have the experience.

> Not a criticism of the tool or your efforts, but IMHO a reason for mutation testing to lack traction.

I'm fine even if it were criticism. I love criticism, as it triggers me to write down my arguments in a consistent way, or fix my decision when such arguments cannot be serialized to words ;)

tjchambers commented 9 years ago

@mbj your continued openness and willingness to entertain these discussions, given your declining interest in Ruby, is from where I sit remarkable. I tend to step back and look at it from a language-agnostic viewpoint and ask what a mutation testing tool for the next great language would require (besides a language with static type checking :) ). So my comments are not Ruby-centric. I wonder what promise a tool needs to have to combine with a community and perhaps a bounty/kickstarter to make it self-supporting as a reinforcing loop. Truly OT.

mbj commented 9 years ago

@tjchambers Even if I wrote a tool for another language, the decisions would be the same: my needs first, then clients, then everyone else. So we can keep discussing Ruby to keep the discussion less abstract.

dkubb commented 9 years ago

> I tend to step back and look at it from a language-agnostic viewpoint and ask what a mutation testing tool for the next great language would require

One of the things that @mbj and I have been discussing is porting mutant to Haskell, so it can be used to test Haskell as well as implementations of other languages written in it, like Idris and PureScript. Given how close those are to Haskell in terms of syntax, it might be possible to then port to those languages with less pain than the port from Ruby to Haskell.

A language with dependent types like Idris will allow you to specify a tremendous amount in the types themselves, and I see mutation testing finding the holes in the specification. I'd love to see a kind of "arms race" between the type system and mutation testing; the "perfect" dependent type system would allow you to specify everything in the types and still pass mutation testing.

EDIT: If we can get a language to adopt mutation testing early on, it will also have a tremendous impact in the quality of the standard library since it will be possible to know how effectively it is tested based on feedback from the mutation tester.

tjchambers commented 9 years ago

@dkubb I inferred that would be a logical discussion given the sentiments around Ruby. I too have dabbled with Haskell.

You can probably see that my viewpoint is about what I, as a practitioner of language X, would desire of a mutation testing tool in support of that language. Obviously each language by its nature would obviate some types of mutations that other languages might require as critical (especially around typing - static or inferred). However, stepping back out, from my user viewpoint, my uptake would be expedited by having the following:

Clear as possible understanding of the categories of mutations (why they are there, what action(s) they suggest, how to "code for fewer mutations" which seems to encourage better coding practices)
Concise reporting of the before and after to grok the gap in the spec (where is the issue, what was evaluated, what is the context)
A flexible manner to target the narrowest possible specs necessary to surface the alive mutants (aka - don't waste any more time on an already computationally-expensive process)
Ability to see the big project picture (where do I stand overall, what gets quick wins, what is critical and still has low mutation coverage)

Bottom line: I would need to see this as a valuable use with high payback for my time investment, given that in its current state it still requires a human to determine the proper alteration to specs/code and implement it. The above is my current thinking on what would drive me to make that time investment.

Mutant for Ruby (this project) hits very well on many of those notes, and to the extent it falls short of some, it is not due to lack of recognition, but time over target by some key individuals with mouths to feed and arms pulled in many directions.

tjchambers commented 9 years ago

I wanted to add that you can include me as a person with "mouths to feed and arms pulled in many directions". Why do I use this tool?

Because when I am challenged by an alive mutation to enhance my specs, I am making an investment in the quality of my future product, and building a stronger safety net. When I - as is too often the case :) - find a bug as a result of a spec gap highlighted by an alive mutant, I am saving my clients the trouble of dealing with misbehavior thereby increasing the likelihood that they will not go elsewhere to find less bug-ridden alternatives. I am not so naive that I believe they will recognize how bug free the tool I provide is. They EXPECT that.

And mutant is one of those things that gets me there faster and with more focus than trying to dream up the essential set of comprehensive tests on my own.

tjchambers commented 9 years ago

My process of code improvement involves the following:

Detect a code issue or smell (either by observation or bug-driven)
See if it is a single case or a pattern of (mis)behavior by looking for the pattern everywhere
If it is a pattern and should be corrected then do the correction as a complete task
Note the behavior to avoid in future

This is where mutant comes in to help. If I find that mutant is calling out a particular mutation repeatedly, and because I am running mutant on the entire code base and saving the logs, I have the potential to scan the logs for similar cases. This is where things currently fall apart, because the particular type of mutation that would be logged is not labeled.

So for instance when a mutation from @foo to foo is logged without a label or tag of some sort, it becomes difficult to use my above process to locate the other occurrences and fix them (hopefully once and for all) across the code base. So providing a label or tag to each alive mutant would help with this, much like reek's UncommunicativeName tags allow me to scan the logs and see where these exist in my code.
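As a sketch of the kind of log scan this would enable, assume each alive mutant were logged as one line of `alive <Label> <file:line>`. That format, the label `BoundaryShift` and the file paths are invented for illustration; mutant does not emit such lines.

```ruby
# Hypothetical log excerpt; mutant does not emit labels in this format.
log = <<~LOG
  alive IvarToReader app/models/user.rb:12
  alive BoundaryShift app/models/order.rb:40
  alive IvarToReader app/services/billing.rb:7
LOG

# Split each line into fields and group the rows by mutation label.
by_label = log.each_line.map(&:split).group_by { |fields| fields[1] }

# Keep only the location for each row.
locations = by_label.transform_values { |rows| rows.map(&:last) }

locations.each do |label, places|
  puts "#{label}: #{places.join(', ')}"
end
```

With a label on every line, finding all occurrences of one kind of mutation becomes a lookup rather than a manual read through before/after diffs.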

I understand this is subject to one buying into my approach. It could also be the foundation for being able to exclude certain types of mutations in the future.

dkubb commented 9 years ago

> So providing a label or tag to each alive mutant would help with this, much like reek's UncommunicativeName tags allow me to scan the logs and see where these exist in my code.

@tjchambers Even if mutant doesn't label each mutation I think we should still define a common set of names to represent each class of mutation. It would give us a common language in which to discuss future mutations rather than describing things in terms of actual before and after code examples.

FWIW, I do think reporting the mutation classification could be quite useful. There will be standard approaches to dealing with each type of mutation, as there are with code smells, and it would make it easier to document these common fixes.

tjchambers commented 9 years ago

@dkubb I see the tagging of a type of mutation, and the classification of a tag or group of tags as intertwined. While broader classifications may be sufficient to the approach to the mutation, it would personally help me to connect the dots back to the specific mutation.

I do realize this would require assigning a label/tag/type/whatever-you-want-to-call-it to each mutation. It's not magic unfortunately. I like the idea of being able to say it was an IvarToReader mutation, as opposed to "it's the one that alters @foo to foo". Perhaps it's just me. I like to put things in boxes.

dkubb commented 9 years ago

@tjchambers it looks like we've been discussing this for a few months. I'm going to propose a way we can make forward progress on this, although I wanted to make a few statements first.

Just about everyone involved in mutation testing, from you and @mbj to other contributors and myself, agrees that there should be more common vocabulary around mutation testing. Some operators are specific to Ruby, but many are relevant across multiple languages. The first step in establishing some common vocabulary is just defining names that make sense to us. As we all know, naming stuff is hard, and is likely to require more discussion and debate.

I think this discussion and debate is really important: naming isn't just hard, it's critically important. At the same time, I don't think we should make it a constraint that we have good names for everything before we get started on this task. If we can come up with some obvious placeholder names for mutations, we can add the infrastructure to assign the temporary names to mutations so they can be logged. Then we can gradually refine the mutation names by proposing pull requests and discussing things in depth there.

So in summary we could:

Add a way to associate a name to a mutation operator
Assign unique placeholder names like Mutation1, Mutation2, Mutation3, etc to each operator.
Log the mutation operator names
Begin the process of naming each operator via PRs that are discussed and debated.
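The four steps above could be sketched roughly as follows. `OperatorNames`, its methods, the operator descriptions and the log format are all hypothetical; nothing here reflects mutant's actual code.

```ruby
# Hypothetical registry mapping mutation operators to names.
class OperatorNames
  def initialize
    @names = {}
  end

  # Associate a name with an operator, defaulting to a unique
  # placeholder such as Mutation1, Mutation2, ...
  def register(operator)
    @names[operator] ||= "Mutation#{@names.size + 1}"
  end

  # Replace a placeholder once a better name has been agreed on.
  def rename(operator, name)
    raise ArgumentError, "name taken: #{name}" if @names.value?(name)

    @names[operator] = name
  end

  # Format a log line that carries the operator name.
  def log_line(operator, location)
    "#{@names.fetch(operator)} #{location}"
  end
end

names = OperatorNames.new
names.register('@foo -> foo')
names.register('>= -> >')

before_line = names.log_line('@foo -> foo', 'user.rb:12')
names.rename('@foo -> foo', 'IvarToReader')
after_line = names.log_line('@foo -> foo', 'user.rb:12')

puts before_line
puts after_line
```

Because the log only ever asks the registry for the current name, renaming an operator in a later PR would not require touching the logging code.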

NOTE: I'm not specifically volunteering to do this since my OSS time is limited, just trying to outline a possible step forward where the end result would be well named mutation operators and a common vocabulary.

mbj / mutant

Criteria/Classification for Mutations? #395