sixty-north / cosmic-ray

Mutation testing for Python
MIT License
565 stars 57 forks source link

Execute job without mutation first: If #459

Closed Alcolo47 closed 5 years ago

Alcolo47 commented 5 years ago

it's fails, we can stop now wtthout spending ours time!

Results are now stored in db as soon as they were available, not at end. (Removing pending = list(pending_work))

abingham commented 5 years ago

We used to do this, but I removed it because it was almost always a waste of time. Users don't generally run CR unless they know their tests already pass. If their tests already pass, then there's not need for CR to verify this.

I'm not completely opposed to the idea, but is there any reason to do the unmutated run besides this check?

abingham commented 5 years ago

Also, this PR seems to conflate several changes, in particular a) executing a non-mutated test run and b) not copying the pending work list. Could you separate the different topics into different PRs?

Alcolo47 commented 5 years ago

We used to do this, but I removed it because it was almost always a waste of time. Users don't generally run CR unless they know their tests already pass. If their tests already pass, then there's not need for CR to verify this.

I'm not completely opposed to the idea, but is there any reason to do the unmutated run besides this check?

My reason is: I my project, I currently falls in this problem: In my development environment, all works fine. In gitlab ci: The same fails. I spent some hours before understanding that in cosmic-ray context, my tests don't works.

My stacks:

abingham commented 5 years ago

How about a different approach: we could add a subcommand (verb) that runs an unmutated version of the code in the context of CR, i.e. just like mutated code. If this returns successfully, you could then run your mutation suite.

This approach has at least two benefits. First, users who don't want to pay the cost for the unmutated run don't have to. Second, it removes a lot of the complexity of your PR...we don't need a magically named job, we don't need special handling for work items with no mutation point, etc.

Ultimately, what I'm describing is our old "baseline" subcommand which ran an unmutated version of the code to calculate a runtime baseline.

Alcolo47 commented 5 years ago

Yes this can be a good approach, another verb or an option for "exec". My mistake was to enter this "no_mutation" result in db. It should be better to display errors of this first run in stdout instead of trying to make stats and reports with it.

abingham commented 5 years ago

I'm not sure about the direction you've taken in this implementation; it's a bit too brute force. You've added a flag that needs to be threaded into the execution engine, and the engine needs to know to do special things when it sees that flag. One manifestation of why I think this is the wrong approach is that it requires every execution engine implementation to handle it separately. I'd prefer something that works with the current design and is as orthogonal to the rest of the system as possible.

I see think of this feature like this: let users run their tests in exactly the same environment as a mutation run - same execution engine, cloning method, etc. - but without the mutation. As such, my instinct right now is to create an operator that does no mutations. For the "no mutation" run, we could simply run a session with only that operator. Of course, we'd want to avoid using that operator in testing runs.

This approach might be tricky, and perhaps it's too clever, but it's the flavor of what I'm thinking of

On the other hand, if we don't care about completely replicating the mutation run environment, then the "no mutation" run could could bypass execution engines, clones, and all of that, simply running the test command. But it sounds like your situation - and probably that of others - is that they need to very the execution environment as much as possible.

abingham commented 5 years ago

Have a look at the no-op-operator branch. This adds a "no-op" operator that can be used to run unmutated code using all of the machinery of CR. If the goal is to verify that your system runs correctly under CR, then I think this is about as close to that as you can get. Does this let you do what you need to do?

Alcolo47 commented 5 years ago

I have a look in the no-op-operator branch, but you will launch a run for each parso node of the project. I missed something ?

OK I will rollback asyncio changes and make a new PR for this.

abingham commented 5 years ago

I have a look in the no-op-operator branch, but you will launch a run for each parso node of the project. I missed something ?

If you use the no-op operator as a normal operator, then yes, it would apply itself to every node in the tree. However, we use it in a very special way, only ever creating a single work-item for it in the baseline command.

Since no-op has the behavior you mentioned, we don't include it in the sequence of operator that OperatorProvider returns. The only way to really access it is by specifically asking for it by name. I'm no 100% satisfied with this solution, but it's very unintrusive - i.e. it leverages all of the basic CR infrastructure in a "natural" way - and it's the best of the options I've been able to think of.

abingham commented 5 years ago

I will rollback asyncio changes and make a new PR for this.

In this new PR (or perhaps in an issue) I'd like you to explain why this change is needed. Is it technically superior in some way? I'm very reluctant to accept changes that only reflect a personal preference.

abingham commented 5 years ago

@Alcolo47 Does the no-op-operator branch give you what you need? If so, I can merge it in soon.

abingham commented 5 years ago

@Alcolo47 Based on #473 it seems that the no-op-operator branch works for you. If that's the case, please let me know and I'll merge it in. Then we can work on merging in #473.

Alcolo47 commented 5 years ago

OK, for no-op solution.