SpoonLabs / astor

Automatic program repair for Java with generate-and-validate techniques: jGenProg (2014) - jMutRepair (2016) - jKali (2016) - DeepRepair (2017) - Cardumen (2018) - 3sfix (2018)
https://hal.archives-ouvertes.fr/hal-01321615/document
GNU General Public License v2.0

Cardumen different behaviour with same seed #296

Closed MaximoOliveira closed 3 years ago

MaximoOliveira commented 3 years ago

Hello,

I've run the test case testCardumentM70Evolve() (from the CardumenApproachTest class) a number of times and I see two different outputs across runs, despite the test case using the same seed (seed = 400).

For example, sometimes I get the following output:

General stats:
EXECUTION_IDENTIFIER=
TOTAL_TIME=187.866
NR_GENERATIONS=148
NR_RIGHT_COMPILATIONS=136
NR_FAILLING_COMPILATIONS=12

and other times the following:

General stats:
EXECUTION_IDENTIFIER=
TOTAL_TIME=40.92
NR_GENERATIONS=74
NR_RIGHT_COMPILATIONS=65
NR_FAILLING_COMPILATIONS=9

Which means sometimes the solution is found at generation 148 and other times at generation 74. Is this behaviour expected? I thought that the output should always be the same when using the same seed and parameters.

Any idea why this might happen? Thank you for your help!

martinezmatias commented 3 years ago

Hi @MaximoOliveira

Thanks for reporting the issue.

It seems Cardumen behaves in a non-deterministic manner. (We were already aware of this, as the CI has a flaky test that exposes the issue.)

It would be nice to find the code that produces this non deterministic behaviour.

MaximoOliveira commented 3 years ago

Thank you for your response @martinezmatias !

I will see if I can detect what causes the non deterministic behaviour

martinezmatias commented 3 years ago

I will see if I can detect what causes the non deterministic behaviour

Great @MaximoOliveira, it will be nice to find it! Tell me if you need some help.

Thanks! Matias

MaximoOliveira commented 3 years ago

Hello @martinezmatias ,

I believe I have made some progress with this Heisenbug

After running testCardumentM70Evolve() a number of times, I found that the solution is found either at generation 74 or at generation 148. No solution was found at any other generation.

The solution for this test case is to replace the expression solve(min, max) with the expression solve(f, min, max) in BisectionSolver (line 72).

When the solution is found at generation 74, it is because the correct ingredient is at index 0 of the list ingredientsAfterTransformation.

When the solution is only found at generation 148 (i.e. at generation 74 an ingredient other than solve(f, min, max) was picked), it is because the list ingredientsAfterTransformation is ordered differently.

Note that the ingredients at index 0 differ between the two ingredientsAfterTransformation lists, although they should be the same. Because the second execution has the ingredient solve(f, max, min) at index 0 instead of solve(f, min, max), it continues its search for solutions.
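To illustrate why a fixed seed is not enough on its own, here is a minimal sketch. It assumes, purely for illustration, that an ingredient is chosen by drawing an index from a seeded generator; the class SeedOrderDemo and the method pick are hypothetical names and are not part of Astor. The seeded generator yields the same index on every run, but if the list it indexes into is ordered differently, a different ingredient comes out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class SeedOrderDemo {

    // Hypothetical selection step: draw an index from a generator seeded with
    // a fixed value. This is an assumption, not Astor's actual selection code.
    static String pick(List<String> ingredients, long seed) {
        Random random = new Random(seed);
        int index = random.nextInt(ingredients.size());
        return ingredients.get(index);
    }

    public static void main(String[] args) {
        // The same two ingredients, in the two orders observed across runs.
        List<String> runA = Arrays.asList("solve(f, min, max)", "solve(f, max, min)");
        List<String> runB = Arrays.asList("solve(f, max, min)", "solve(f, min, max)");

        // Same seed, same index, but a different element because the order differs.
        System.out.println(pick(runA, 400L));
        System.out.println(pick(runB, 400L));
    }
}
```

The seed pins down the sequence of random draws only; any ordering that the draws are applied to has to be made reproducible separately.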

This list is sorted by the highest probability of VarCombination, and both combinations in this case have the same probability (0.16).

I do not know why in some cases the list prioritizes one ingredient over the other when both have the same probability. A fix could be to re-sort this list before picking an ingredient (after being sorted by probability, it could be sorted alphabetically); this would ensure the list is always in the same order.
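The proposed tie-break can be sketched as follows. This is an illustration under assumptions, not the change that was made in Astor: the Candidate class, its fields, and the method sortedCodes are hypothetical stand-ins for an ingredient and the probability of its variable combination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DeterministicIngredientOrder {

    // Hypothetical stand-in for an ingredient plus the probability of its
    // variable combination; not an Astor class.
    static final class Candidate {
        final String code;
        final double probability;

        Candidate(String code, double probability) {
            this.code = code;
            this.probability = probability;
        }
    }

    // Highest probability first; equal probabilities are ordered by source text.
    static final Comparator<Candidate> ORDER =
            Comparator.comparingDouble((Candidate c) -> c.probability)
                    .reversed()
                    .thenComparing((Candidate c) -> c.code);

    // Returns the ingredient texts in the deterministic order, leaving the input untouched.
    static List<String> sortedCodes(List<Candidate> input) {
        List<Candidate> copy = new ArrayList<>(input);
        copy.sort(ORDER);
        List<String> codes = new ArrayList<>();
        for (Candidate c : copy) {
            codes.add(c.code);
        }
        return codes;
    }

    public static void main(String[] args) {
        Candidate minMax = new Candidate("solve(f, min, max)", 0.16);
        Candidate maxMin = new Candidate("solve(f, max, min)", 0.16);

        // Both input orders produce the same output order.
        System.out.println(sortedCodes(Arrays.asList(minMax, maxMin)));
        System.out.println(sortedCodes(Arrays.asList(maxMin, minMax)));
    }
}
```

List.sort is a stable sort, so with a probability-only comparator two ingredients with equal probability keep whatever relative order they arrived in. The secondary key removes that dependence on arrival order. Note that it only makes the order reproducible; it does not favour the correct ingredient.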

I hope this was helpful to understand the bug better.

Let me know what you think

Thank you!

martinezmatias commented 3 years ago

Hi @MaximoOliveira

Thanks a lot for the detailed analysis!

A fix could be to re-sort this list before picking an ingredient (after being sorted by probability, it could be sorted alphabetically); this would ensure the list is always in the same order.

I completely agree with you.

If I am not wrong, in that case the list of ingredients ingredientsAfterTransformation is ordered by the probability of the variables that make up each ingredient. The sets of variables that can be assigned to an ingredient are sorted here: https://github.com/SpoonLabs/astor/blob/7bb0810aba97031d538831ccc7f6d90799ff722b/src/main/java/fr/inria/astor/core/solutionsearch/spaces/ingredients/transformations/ProbabilisticTransformationStrategy.java#L184

However, I think that once those sets of variables are sorted, the list ingredientsAfterTransformation is not sorted again, e.g., https://github.com/SpoonLabs/astor/blob/7bb0810aba97031d538831ccc7f6d90799ff722b/src/main/java/fr/inria/astor/core/solutionsearch/spaces/ingredients/transformations/ProbabilisticTransformationStrategy.java#L116

We can force a re-sort of the list when it is created, i.e., in the method transform of the class IngredientTransformationStrategy. WDYT?

Thanks! Matias

MaximoOliveira commented 3 years ago

Hello @martinezmatias

Yes, re-sorting it in the method transform seems to be the best way!

I can implement the changes and open a pr today

martinezmatias commented 3 years ago

Hi @MaximoOliveira

I can implement the changes and open a pr today

Great, thanks a lot!