Closed SNielebock closed 4 years ago
We did a further analysis of that bug. Attached you find our testbed (Main.java.txt) as well as the Java file which reveals that behavior (4.java.txt).
We found out that:
AUGBuilder
generates different AUGs and therefore different patterns (in terms of numbers and types).For us this looks like a classical Heisenbug somewhere in the construction of the AUG.
We also observed this behavior on two independent computers, both running Windows 10.
Hi,
thank you very much for reporting and tracing this bug.
We had problems with indeterministic AUG generation is the past. Seems we didn't rule that out entirely yet... unfortunately, I will not be able to debug this anytime soon. Can you work around this by using serialization for the time being? I realize this is far from optimal...
Meanwhile, the mining is expected to be non-deterministic. For scalability reason, we do not perform an exhaustive search of all subgraphs, but greedily extend pattern candidates (no backtracking). This implies we might miss patterns. Since AUGs use Java HashSets to store edges, and the mining explores AUGs along edges, it might select neighbor nodes for extension in a different order in a different run and, thus, end up with different patterns. The same is true for the detection.
For a description of both algorithms, please check the preprint of our paper, which just became available.
We tried once to root out this source of non-determinism by using sets with stable order, but it turns out we already get things with non-deterministic order from the JDT (which we use for parsing and resolving the source code), such that we cannot entire control this.
Closing, since no reply since 6 months.
When running the current version of the AUGMiner
(0.0.3-SNAPSHOT)
, we observed non-deterministic mining results, i.e, different patterns as well as different numbers of patterns, while the input data, i.e., the source files, and the parameters of the Miner remains the same.A minimal example is in the attached file 4.txt with a single method.
We generate the AUGs as follows:
However, we also load the AUGs from previously serialized ones. Thus, we do not think that the non-deterministic results are caused by a non-deterministic AUG generation.
We configure the miner as follows:
We also set the
edu.iastate.cs.egroum.dot.DotGraph.EXEC_DOT
variable to the respective dot-executable.The miner is then executed via
We observed the non-deterministic behavior under Windows as well as under Ubuntu. Is this non-deterministic behavior of the miner expected or is this an issue on how we applied the miner.