ASSERT-KTH / spork

AST-based structured merge tool for Java, fully Git compatible https://doi.org/10.1109/TSE.2022.3143766
MIT License

Try using gumtree-simple matcher in Spork instead of gumtree-classic #493

Open jrfaller opened 8 months ago

jrfaller commented 8 months ago

Since version 3.0, GumTree has a new, faster, and more accurate matcher called gumtree-simple. I think it would be worth trying it in Spork in place of gumtree-classic as the base matcher, and perhaps also as the left-right matcher. I am not sure where in the code the matchers are instantiated, but I think it should be easy to do.

WDYT?

Cheers!

monperrus commented 8 months ago

It's a great idea!

jrfaller commented 8 months ago

OK, I found the code here: https://github.com/ASSERT-KTH/spork/blob/master/src/main/kotlin/se/kth/spork/spoon/Spoon3dmMerge.kt#L355-L365. I can therefore easily make a patch to switch to gumtree-simple. However, it would be nice to know whether it improves merging time and/or textual similarity to the gold set compared to using classic and XY. Do you have any idea how to do that?

slarse commented 8 months ago

Hey @jrfaller,

That's an interesting idea. Swapping out the left/right matcher is almost guaranteed to cause failures, as everything is very finely tuned for the X/Y matcher. I chose the X/Y matcher because it was less prone to spurious left/right matchings (which cause massive trouble). The default matcher caused far too many spurious matchings and was impractical to use for that purpose.

The base matcher might be OK to swap out, but it also depends on how Gumtree-Spoon has changed since Spork was created. Spork makes heavy use of Gumtree-Spoon's internal behavior. I think the most blatant example of this is where it infers additional mappings (see https://github.com/ASSERT-KTH/spork/blob/7672b762285fafb520e45ebf0fcfae3fe014c000/src/main/kotlin/se/kth/spork/spoon/matching/SpoonMapping.kt#L50-L61).

The benchmark suite should be usable for some quick-and-dirty evaluations. The latest version is in the replication package (download https://github.com/ASSERT-KTH/spork/releases/download/v0.5.1/replication_package.tar.gz; the benchmark suite is in replication_package/software/benchmark-scripts, and the README explains things from there). I'm not sure why I never checked that into the repository.

There's also a very outdated version of the benchmark suite that for reasons along the lines of "I never got around to it" still runs in CI, see e.g. https://github.com/ASSERT-KTH/spork/actions/runs/7715093311/job/21028824279. But I would go with the one in the replication package.

For a comprehensive evaluation I think just running the experiments again would be the least hands-on thing to do (see https://github.com/ASSERT-KTH/spork/tree/master/replication#setting-up-for-the-experiments).
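
For a quick look before running the full benchmark suite, the two quantities asked about above (merge time and textual similarity to the expected result) could be measured roughly along these lines. This is only a minimal Python sketch, not the benchmark suite's code: `MergeCase`, `evaluate`, `summarize`, and the `merge` callable are hypothetical names, and the `difflib` line ratio is an assumed stand-in for whatever similarity metric the replication package actually uses.

```python
import difflib
import statistics
import time
from typing import Callable, Iterable, List, NamedTuple, Tuple


class MergeCase(NamedTuple):
    """One merge scenario (hypothetical structure): three inputs plus the gold result."""
    base: str
    left: str
    right: str
    expected: str  # the merge result the developers actually committed


class Result(NamedTuple):
    seconds: float
    similarity: float


def line_similarity(actual: str, expected: str) -> float:
    """Line-based difflib ratio in [0, 1]; 1.0 means the texts are identical."""
    matcher = difflib.SequenceMatcher(
        None, actual.splitlines(), expected.splitlines(), autojunk=False
    )
    return matcher.ratio()


def evaluate(
    merge: Callable[[str, str, str], str], cases: Iterable[MergeCase]
) -> List[Result]:
    """Run `merge` on every case, recording wall-clock time and similarity to gold."""
    results = []
    for case in cases:
        start = time.perf_counter()
        merged = merge(case.base, case.left, case.right)
        elapsed = time.perf_counter() - start
        results.append(Result(elapsed, line_similarity(merged, case.expected)))
    return results


def summarize(results: List[Result]) -> Tuple[float, float]:
    """Return (median seconds, median similarity) over all results."""
    return (
        statistics.median(r.seconds for r in results),
        statistics.median(r.similarity for r in results),
    )
```

One would pass one callable per configuration (for instance, a wrapper around Spork built with gumtree-classic and another around Spork built with gumtree-simple) over the same merge scenarios and compare the two summaries.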

monperrus commented 2 weeks ago

> OK found the code here -> master/src/main/kotlin/se/kth/spork/spoon/Spoon3dmMerge.kt#L355-L365 therefore I can easily make a patch to update to gumtree-simple

Give it a try, @jrfaller!

jrfaller commented 2 weeks ago

OK, I am preparing a pull request. We will then see how to run a benchmark to check whether it is useful. Cheers!

jrfaller commented 2 weeks ago

Started the WIP in #534; you can take a look.