boxed / mutmut

Mutation testing system
https://mutmut.readthedocs.io
BSD 3-Clause "New" or "Revised" License
891 stars 107 forks source link

More powerful whitelisting system #47

Open blueyed opened 5 years ago

blueyed commented 5 years ago

A lot of surviving mutants in a project are due to changing constants, e.g. FOO = 1, where FOO itself is used in tests then also. Killing those would need a test where the value itself is used, which does not make much sense, does it?

I think it would be good if those mutations could be skipped (e.g. by not mutating anything that matches [A-Z_]+ maybe?

It would be good if those could be deselected either by name of the mutation (which would need to be something like mutate-constant-X), or by specifying a pattern of identifiers not to mutate. (it is not possible to deselect mutations currently, is it?)

boxed commented 5 years ago

This seems like mostly the same as what we discussed in https://github.com/boxed/mutmut/issues/28 but for different reasons obviously. Maybe one should be able to just plug in an arbitrary python function that gets called to ask for exclusion?

boxed commented 5 years ago

Actually, thinking more about this it would be interesting if you could explain more about your situation. In my experience mutating constants is one of the more useful things for finding lacking tests or even outright bugs. I don't understand your description.

blueyed commented 5 years ago

If you have EXIT_OK = 0 and then check if your program exits with EXIT_OK in some case, the test will also pass with EXIT_OK = 1 (because the constant is used in the test also). You would need a test that EXIT_OK == 0, or use the number itself in the test. But in this case the constant more or less defines the behavior.

boxed commented 5 years ago

Ah, I see. Well yea, in that case the current system to deal with it is # pragma: no mutate. What do you think about more advanced exclusions like having a callback you can specify in the config?

nklapste commented 5 years ago

Just to chime in for a similar issue. I was experiencing false positives due to mutmut modifying the __version__ within a __init__.py

--- graypy/__init__.py
+++ graypy/__init__.py
@@ -20,5 +20,5 @@
     pass  # amqplib is probably not installed

-__version__ = (0, 3, 1)
+__version__ = None

It would likely be good practice to avoid mutating misc python doc/notes variables e.g.

__author__ = "Software Authors Name"
__copyright__ = "Copyright (C) 2004 Author Name"
__license__ = "Public Domain"
__version__ = "1.0"
boxed commented 5 years ago

Yea, maybe a global and built in whitelist would be a good idea. I have another candidate for that:

__import__('pkg_resources').declare_namespace(__name__)

we always have to put a pragma no mutate on those.

boxed commented 5 years ago

@nklapste I've released a new version of mutmut that ignored a bunch of these by default. It's also got an overhauled cache system that is way better. Give it a try!

lundybernard commented 5 years ago

A related issue that I'm dealing with: I want to skip mutations on log messages. I normally avoid making assertions about log messages, as it feels more like locking down the code than a behavioral test.

ex:

 log.debug(
     'whatever message you like, maybe its long and'
-     'broken up over multiple lines'
+    'XXbroken up over multiple lines'
 )

# pragma: no mutate isn't a good option at a cost of 19 characters per line

blueyed commented 5 years ago

A more flexible approach, that could live outside the actual source would be nice. E.g. having a callback in the config like mentioned already. Not sure what information it should get, but e.g. matching regular expressions against (original) code lines would be useful there I think.

boxed commented 5 years ago

For the case @lundybernard has, the line obviously won't do because the mutated string is on a different line than log.debug, so maybe we should just pass the entire multiline statement.

If we allow python code in the config file, then we should probably require (and validate!) that the callback provided has **_ in the parameter list so we can add keyword arguments when we find new uses cases down the line.

djmattyg007 commented 5 years ago

I can't help but feel that needing to achieve 100% killed mutants will always be an endless game of whack-a-mole that isn't worth playing. I would prefer a different strategy that accepts the fact that false positives will occur.

Take a look at how the PHP mutation testing framework 'Infection' handles this:

https://infection.github.io/guide/using-with-ci.html

The notion of not being able to achieve 100% killed mutants is baked into the design of the tool. I think this is a much better approach than requiring that everything be whitelisted.

boxed commented 5 years ago

Hitting any percentage above zero is also an endless game of whack a mole to be fair :) It's just harder to reach 100% the first time, that's the big difference.

I do agree your underlying point though: ratchet are a powerful tool to be able to slowly make things better instead of always be perfect or giving up.

So yea, this is a feature worth having. But I do get the feeling that we'd like something smarter! Just having a fixed percentage for the entire project doesn't seem like a very good ratchet to me.

magwas commented 5 years ago

In my case self.flag:bool = False was mutated to self.flag:bool = None

I would like to tell mutmut that None makes no difference in that context, but do not want to stop it from other mutations. (Yes, first I looked into possibilities to run the tests with a dynamic typechecker, but did not find any useable solution. Plus neither mypy nor pytype finds anything wrong with the mutated line...).

I would also like to comment on some use cases above:

  1. constants: I believe that constants in production code are either insignificant (like your version number), or else they should be actually checked against a copy in the test code (and constants/test artifacts should be organized to their own module(s) to avoid duplication within the test code). Of course this can be viewed as a matter of preference, but actually this is a difference between checking behaviour and implementation.
  2. log/exception/whatever messages: these should be i18ed anyway, which means that the string in production code is for machines, not people (even if it is equivalent with the English translation at the beginning). So if you want to change it because of users, you will change it in the po file. When the messages are built in a complex way, you defer turning data structures to text, and check different aspects in different tests, with as dumb text parsing as possible. This way one can avoid checking implementation details while message contents can be checked.
  3. what mutation coverage is needed: 100% of unit test coverage. Mutation coverage basically tells you whether you do TDD in the correct way. If not, then you have to continue learning. Of course there are cases which are impossible to unit test, but you do not unit test them in the first place. Once you can unit test a code, then you can unit test it correctly as well with little extra effort.
boxed commented 5 years ago

https://github.com/boxed/mutmut/issues/138 is the same thing as this issue really.

EmilStenstrom commented 5 years ago

Another use-case: I maintain a parser library, where parse errors are manifested as "raise ParseException" all over the code base. I don't care enough about the exact phrasing of the error messages to test for that string in all my tests, so mutmut (correctly) finds all instances of ParseException and shows them as errors.

I would like a way to filter away all those errors (which are about 50% of all errors in my case), so I could focus on the important stuff it find. Maybe I could run this in CI with a good way to filter out things I don't need tested?

"# pragma: no mutation" would work in my case, but feels ugly and verbose, and is easy to miss for contributors that want to add code.

Some ideas of how this could be handled:

1) Each hit is as a string to a filtering function you can specify yourself. 2) The AST is sent to a filtering function you can specify yourself 3) A list of exception strings in setup.cfg that the matched mutation line string could be matched against. 4) more...?

boxed commented 5 years ago

I had an idea.. what if the answer is "all of the above"? Matching a full line with "contains" is option number 1, then regexes as option number 2, then specifying a python function that gets passed the line and the AST node as number 4. But starting with 1 and maybe 2 of course.

The reason I'm thinking "all of the above" is because there is a huge usability and speed difference between those options. Just finding if a string contains another string is super fast and easy to understand so we prefer that if possible.

Any thoughts?

EmilStenstrom commented 5 years ago

To be clear: My use-case would be solved with just step 1.

boxed commented 5 years ago

I guess most use cases would be yes.

boxed commented 4 years ago

@joshz123 wrote:

I am using whitelisting with mutmut and to the best of my knowledge you can only whitelist with #pragma: no mutate It would be nice if there was some decorator or blockquote equivalent for whitelisting more than one line at a time. In my case, I had a very long hardcoded list of strings and had many failed mutations due to the strings being mutated even though the content of the string was irrelevant in this context.

boxed commented 4 years ago

It's not been very clear to me what we need. #pragma no mutate on a block? Or #pragma no mutate start/end? What would cover your use case? Or could regexes handle it?

joshzwiebel commented 4 years ago

pragma no mutate start/end would cover my use case. Sorry for any confusion

magwas commented 4 years ago

I would vote to an annotation instead of a pragma. Of coure it is just a matter of taste.

My use case would be covered by the ability to mark individual lines.

boxed commented 4 years ago

@magwas thst doesn't make sense. Annotations only exist for functions and classes, that's way too limited. And single lines already exist.

magwas commented 4 years ago

@magwas thst doesn't make sense. Annotations only exist for functions and classes, that's way too limited. And single lines already exist.

Well, ok, python is not java. Sorry.

boxed commented 4 years ago

I have an experimental hook API on master now that I have used to mutation test iommi. You can check out how I use it here: https://github.com/TriOptima/iommi/blob/master/mutmut_config.py I use it to narrow the test suite based on the file being mutated. This is dramatic speedup (although more mutants probably survive, which I actually think is good). It's the implementation of my thinking on testing and modules (https://kodare.net/2019/10/18/which_unit.html)

This system also has a way to skip mutations. Set context.skip = True and it gets skipped. You can use this to implement your own regex matching or anything else really. I would love it if someone who has this problem would like to try this out!

boxed commented 4 years ago

This is now released.

boxed commented 4 years ago

I've now added a bit of documentation in the readme on this.

kronenpj commented 4 years ago

Similar to #138 I'd like to see if a feature could be added to allow skipping mutations when a function is an instance of a specified object. Primarily my use case is: avoid mutating strings to be logged (instance of logging) so that I don't have slews of #pragma comments floating around. I'm sure this belongs in another issue, but the title of this one is sufficiently broad to bring it up here.

boxed commented 4 years ago

This is absolutely the correct issue!

I can't really use the type of the object, as mutmut is just working with the source code, not the running process. But it's pretty simple to add a rule that matches the line against log\.(.* or something. (Or even the AST node if you're worried about multiline log calls).

Would you like some help setting this up for your use case?

kronenpj commented 4 years ago

Please! Here are a few examples of the entries I'd like to avoid mutating:

log.debug(f"Debug argument is not valid") mylog.debug(f"Retrieving data for question ID: {qid}") input(f"Enter exam to generate {choice_list}: ") Etc...

I don't think having test cases for these is really the point of testing.

Thanks!

boxed commented 4 years ago

in mutmut_config.py add:

def pre_mutation(context):
    if context.current_source_line.strip().startswith('log.'):
        context.skip = True

that's the basic one, I hope you see how it's trivial to add mylog, input and whatever else you need.

kronenpj commented 4 years ago

Indeed, that will be easy to build upon. Thanks!

boxed commented 4 years ago

Let me know how it goes!

kronenpj commented 4 years ago

I have the desired exclusions working as I intended, though some I would have thought have been defaults. :) Here's a sample:

    if context.current_source_line.strip().startswith('sys.exit'):
        context.skip = True
    if context.current_source_line.strip().startswith('print'):
        context.skip = True
    if context.current_source_line.strip().startswith('log'):
        context.skip = True
    if context.current_source_line.strip().startswith('mylog'):
        context.skip = True
    if context.current_source_line.strip().startswith('argp.add_argument'):
        context.skip = True
    if 'outstream.write' in context.current_source_line.strip():
        context.skip = True
boxed commented 4 years ago

Well, it depends on the program too much to be a default I think.

kronenpj commented 4 years ago

Very true. However, more examples in https://github.com/boxed/mutmut#advanced-whitelisting-and-configuration would be helpful for those who are just finding this project. Looking at the code doesn't immediately provide satisfaction, especially burined 400+ lines into a 1200+ line file. That's not a complaint. I'm pointing out that the code I posted above still doesn't immediately flow from examining the source:

    @property
    def current_source_line(self):
        return self.source_by_line_number[self.current_line_index]
boxed commented 4 years ago

I added an example for the case about just matching the line. Seems like it should cover many use cases.

lmkawakami commented 1 year ago

This is absolutely the correct issue!

I can't really use the type of the object, as mutmut is just working with the source code, not the running process. But it's pretty simple to add a rule that matches the line against log\.(.* or something. (Or even the AST node if you're worried about multiline log calls).

Would you like some help setting this up for your use case?

can you show how to do it in the "AST node way" for multiline (log) calls?

e.g..:

app.logger.error(
    err,
    extra = get_extra_info(
        code = 500,
        error = ErrorInfo(
            msg=str(err), code=500, stack_trace=traceback.format_exc(), dbg_msg="type 1 error"
        ).dict()
    )
)
boxed commented 1 year ago

hmm.. Reading through the code, it seems doing it on the AST level with the current API isn't really a thing. You'd have to parse the file again with parso and then look at the AST....

EJahren commented 3 months ago

For my use, all of the suggestions are useful, but I cannot really persist a bunch of #pragma: no mutate in the source code when you work in a team and not everyone care about mutation testing. Also, I may not want to skip mutating the line all together, but rather I want to note down that I looked at a given mutation and found it to be benign. The way to do that in my mind is to mark it as benign in the cache. That way I can keep track of which surviving mutants I have not looked at.