Closed pouryafard75 closed 9 months ago
Hi @pouryafard75 and thanks a lot for spotting this nasty concurrency bug that prevents parallelizing diffs.
I took a look at your solution but I am not sure it's how we should proceed. Initially, I used a global type registry to have a fast equality test for types, using only a reference check instead of a string comparison, but your solution would revert to this slower equality checking (and this is a very frequent operation).
By thinking more about the root of the problem, I think the culprit here is the type constructor (https://github.com/GumTreeDiff/gumtree/blob/main/core/src/main/java/com/github/gumtreediff/tree/TypeSet.java#L38-L40) that is not thread-safe. Maybe we could try to make this method synchronized
? Could you try this fix to see if it would work with your test?
Cheers!
@jrfaller Thanks for your feedback.
Your suggested solution works perfectly, and I have updated the PR.
Would you recommend moving the test to core
and then loading the tree from xml file? Because the test doesnt have to do anything with jdt
.
Hi @pouryafard75 and thanks for trying the fix out!
I completely agree with you that the test case should be located inside the core package, and should be as small as possible.
I think the problem appears in such a case :
So if I am not wrong, just having one-node trees true
to ==
.
WDYT?
@jrfaller Thanks for your comment. Its a bit challenging to make the test fail in case of having super small trees. Also we cannot create the Type in the setup, because that will ensure the type has been added to registry properly before any tests.
This is what I have replicated
@RepeatedTest(10)
public void testMultiThreading() {
Type foo1 = TypeSet.type("foo");
Type foo2 = TypeSet.type("foo");
assertSame(foo1,foo2);
}
I also moved it to TestTree. but the problem is, it doesn't fail frequently. I assume the previous test due to the time it took to parse the tree and generate, had higher chance of failure.
I have this one :
@Test
public void testTypeThreading() throws InterruptedException {
int n = 20;
ExecutorService exec = Executors.newFixedThreadPool(n);
List<Type> types = new ArrayList<>();
for (int i = 0; i < n; i++) {
exec.submit(() -> {
types.add(TypeSet.type("foo"));
});
}
exec.awaitTermination(1, java.util.concurrent.TimeUnit.SECONDS);
for (Type t1 : types) {
for (Type t2: types) {
assertSame(t1, t2);
}
}
}
Seems to fail almost every time on my machine, is it the case for you too?
@jrfaller Yes this works. (I was fixated to take advantage of repeated tests to reproduce the bug :grimacing:) I have updated the PR.
Hello,
I've identified a bug affecting multithreaded execution.
This PR consists of 3 commits, and I'll break down each commit:
I have created a repeated test (executes in separate threads) (commit#1) which runs GumTree Classic matcher on a file revision, and observed discrepancies in the number of generated mappings. To enable a multithreaded environment, I relied on JUnit 5's CONCURRENT execution, requiring the addition of junit-platform.properties.
Test result in a multithreaded environment before the fix
After investigating, I realized the issue lies in the equality check for types
getType() == t.getType()
, which works in a single-threaded environment but fails when multiple GumTree instances are running. In commit#2, I addressed this problem for Classic and Simple Matchers (also ZsMatcher since Classic depends on that) Test result in a multithreaded environment after the fixAdditionally, I noticed the same issue across different matchers, leading to commit#3, where I replicated the fix throughout the entire project.
P.S: I'm aware that the test should ideally be located in the core module, but due to the dependency on JdtTreeGenerator, I couldn't move it. I can export the tree XML and load it to eliminate the dependency. Please let me know your thoughts.