ASSERT-KTH / spork

AST-based structured merge tool for Java, fully Git compatible https://doi.org/10.1109/TSE.2022.3143766
MIT License
49 stars 6 forks source link

Add support for diff3 conflict style #496

Open wetneb opened 7 months ago

wetneb commented 7 months ago

Git offers the diff3 conflict style, where conflicts show not only the left and right branch, but also the base version:

<<<<<<< ours
  puts 'hola world'
||||||| base
  puts 'hello world'
=======
  puts 'hello mundo'
>>>>>>> theirs

I always use this style because it makes it much easier to understand what happened and to find the right resolution, in my experience.

It could be nice to support this in Spork too. Arguably the fact that Spork generates much more granular conflicts reduces the need for it, but still, I think it would be a nice thing to have.

I have had a quick look at implementing this and it seems doable, but I noticed an oddity: as I tried introducing a new parameter in the CLI and passing it down to the place where conflicts are formatted, I found myself having to add this parameter to intuitively unrelated classes like Parser.kt, because the printing of the conflicts is controlled by the pretty-printer set on the original AST at parsing time. It feels a bit weird: intuitively the parser should not need to know about how we want to display conflicts.

monperrus commented 7 months ago

Great idea, would you do a pull request?

wetneb commented 7 months ago

It's not actually clear to me if the algorithm used to generate the conflicts allows for generating a "base" part alongside the "left" and "right" ones.

This seems to be the place where the "left" and "right" parts are currently extracted:

https://github.com/ASSERT-KTH/spork/blob/e440feb1a25b1e37b37cd2ab632499c0028f72fa/src/main/kotlin/se/kth/spork/spoon/pcsinterpreter/SporkTreeBuilder.kt#L213-L226

Not being familiar with the PCS structure, it's not clear to me if there is a simple way to also generate the "base" part in this context. I don't even know if it actually makes sense - it's unclear to me to what extent the base revision is relied on, so it can well be that there are cases where only the left and right revisions are compared, without the ability to trace elements back to the base revision?

@monperrus are you clear on this?

slarse commented 7 months ago

@wetneb There is unfortunately no simple way to just tack on the base part of the conflict after the merge has completed. The merge algorithm works by removing so-called inconsistencies in the raw merge (which is just the set union of all three ASTs), and iteratively removes anything from the base revision that is inconsistent with either the left or right revision. So, by definition, when the merge has completed and conflicts have been resolved, the base part of any conflict is already gone.

For a brief definition of the merge algorithm, you can refer to section 2.2 of the whitepaper (a preprint is freely available on arxiv). For a significantly more in-depth explanation with worked examples, you may refer to Chapter 4 of my master's thesis (note: it's quite a long read).

I never had time to get around to even experimenting with outputting the base revision due to being short on time, but off the top of my head I can imagine two potential ways of resolving the base part of a conflict. Do note that these are ideas born from thinking through code I wrote 4 years ago, so I can't guarantee either will work in practice.

It feels a bit weird: intuitively the parser should not need to know about how we want to display conflicts.

This is the canonical way to use Spoon: set the environment first (which includes the pretty printer), and then parse, analyze/transform and pretty print with that environment. Setting of the printer requires modifications to the environment, and doing that at any time after parsing would be a bit unexpected. There also was never a reason to leave this configurable before, so there's that as well. Feel free to put up a PR with improvements.