Closed Alcolo47 closed 5 years ago
We used to do this, but I removed it because it was almost always a waste of time. Users don't generally run CR unless they know their tests already pass. If their tests already pass, then there's not need for CR to verify this.
I'm not completely opposed to the idea, but is there any reason to do the unmutated run besides this check?
Also, this PR seems to conflate several changes, in particular a) executing a non-mutated test run and b) not copying the pending work list. Could you separate the different topics into different PRs?
We used to do this, but I removed it because it was almost always a waste of time. Users don't generally run CR unless they know their tests already pass. If their tests already pass, then there's not need for CR to verify this.
I'm not completely opposed to the idea, but is there any reason to do the unmutated run besides this check?
My reason is: I my project, I currently falls in this problem: In my development environment, all works fine. In gitlab ci: The same fails. I spent some hours before understanding that in cosmic-ray context, my tests don't works.
My stacks:
local: pipenv, pycharm: When I launch tests from pycharm console, all works fine. I have configured cosmic-ray with [cosmic-ray.cloning] commands = ["pipenv sync -d"]. Note that cosmic-ray is one of pipenv dev dependencies.
my CI: gitlab ci with dockerized workers. For tests, workers instantiate an docker container based on a pipenv image, and the run pipenv sync -d. Inside the pipenv context (pipenv run), tests scripts are launched (with mutation tests that instanciate another virtualenv). I don't know why but packages installed into /tmp/ virtualenvs seems to be unreachable from the test command (pipvenv run seems to be different to . venv/bin/activate ...)
How about a different approach: we could add a subcommand (verb) that runs an unmutated version of the code in the context of CR, i.e. just like mutated code. If this returns successfully, you could then run your mutation suite.
This approach has at least two benefits. First, users who don't want to pay the cost for the unmutated run don't have to. Second, it removes a lot of the complexity of your PR...we don't need a magically named job, we don't need special handling for work items with no mutation point, etc.
Ultimately, what I'm describing is our old "baseline" subcommand which ran an unmutated version of the code to calculate a runtime baseline.
Yes this can be a good approach, another verb or an option for "exec". My mistake was to enter this "no_mutation" result in db. It should be better to display errors of this first run in stdout instead of trying to make stats and reports with it.
I'm not sure about the direction you've taken in this implementation; it's a bit too brute force. You've added a flag that needs to be threaded into the execution engine, and the engine needs to know to do special things when it sees that flag. One manifestation of why I think this is the wrong approach is that it requires every execution engine implementation to handle it separately. I'd prefer something that works with the current design and is as orthogonal to the rest of the system as possible.
I see think of this feature like this: let users run their tests in exactly the same environment as a mutation run - same execution engine, cloning method, etc. - but without the mutation. As such, my instinct right now is to create an operator that does no mutations. For the "no mutation" run, we could simply run a session with only that operator. Of course, we'd want to avoid using that operator in testing runs.
This approach might be tricky, and perhaps it's too clever, but it's the flavor of what I'm thinking of
On the other hand, if we don't care about completely replicating the mutation run environment, then the "no mutation" run could could bypass execution engines, clones, and all of that, simply running the test command. But it sounds like your situation - and probably that of others - is that they need to very the execution environment as much as possible.
Have a look at the no-op-operator
branch. This adds a "no-op" operator that can be used to run unmutated code using all of the machinery of CR. If the goal is to verify that your system runs correctly under CR, then I think this is about as close to that as you can get. Does this let you do what you need to do?
I have a look in the no-op-operator branch, but you will launch a run for each parso node of the project. I missed something ?
OK I will rollback asyncio changes and make a new PR for this.
I have a look in the no-op-operator branch, but you will launch a run for each parso node of the project. I missed something ?
If you use the no-op operator as a normal operator, then yes, it would apply itself to every node in the tree. However, we use it in a very special way, only ever creating a single work-item for it in the baseline
command.
Since no-op has the behavior you mentioned, we don't include it in the sequence of operator that OperatorProvider
returns. The only way to really access it is by specifically asking for it by name. I'm no 100% satisfied with this solution, but it's very unintrusive - i.e. it leverages all of the basic CR infrastructure in a "natural" way - and it's the best of the options I've been able to think of.
I will rollback asyncio changes and make a new PR for this.
In this new PR (or perhaps in an issue) I'd like you to explain why this change is needed. Is it technically superior in some way? I'm very reluctant to accept changes that only reflect a personal preference.
@Alcolo47 Does the no-op-operator branch give you what you need? If so, I can merge it in soon.
@Alcolo47 Based on #473 it seems that the no-op-operator branch works for you. If that's the case, please let me know and I'll merge it in. Then we can work on merging in #473.
OK, for no-op solution.
it's fails, we can stop now wtthout spending ours time!
Results are now stored in db as soon as they were available, not at end. (Removing pending = list(pending_work))