ASSERT-KTH / spork

AST-based structured merge tool for Java, fully Git compatible https://doi.org/10.1109/TSE.2022.3143766
MIT License
49 stars 6 forks source link

Unexpected and invalid merge outputs #533

Open kjy5 opened 2 months ago

kjy5 commented 2 months ago

Hi, I'm part of a research paper evaluating various merge algorithms. We noticed that while Spork performed exceptionally well, it also sometimes made unexpected and invalid outputs. We'd like to share some of these patterns with you in hopes that you may be familiar with them and know what could be causing them (we suspect Spoon and the pretty-printing process).

Our testing framework is hosted here. There are instructions in the README that explain how to reproduce our results. Specifically, at the bottom are instructions on how to replay a merge provided an ID so you may view the full context of the merge. IDs are provided in parentheses below.

This is a long list of examples and we'd appreciate any insight as to potential causes for them.

Non-compilable errors

These are merge outputs that produce invalid Java code.

Places Parentheses Incorrectly (345-203)

Expected:

Assertions.assertThat(ex.getMessage().contains(expectedError));
                                                             ^

Spork:

Assertions.assertThat(ex.getMessage()).contains(expectedError);
                                     ^

Classes With the Same Name (1322-24)

This is an imported name conflict. Expected:

protected javax.jms.Message convertToJMSMessage ...
          ^^^^^^^^^^

Spork:

protected Message convertToJMSMessage ...
          ^

This conflicted with another package that exported a class named Message. It was previously disambiguated by using javax.jms.Message.

Omission of Types (1741-4)

Expected:

} catch (Exception e) {
         ^^^^^^^^^

Spork:

} catch ( e) {
         ^

Incorrect Placement of Generics (2955-13)

Expected:

new UnsignedVariableBitLengthType( img, nBits ) );
                                        ^^^^^

Spork:

new <nBits>UnsignedVariableBitLengthType(img));
    ^^^^^^^

Dropping Escape Characters (4595-12)

Expected

return "\"";
        ^

Spork:

return """;
        ^

Merge Cascaded Conditionals (1885-445)

Expected:

} else {
    if (!daemon) {
        log.info("K3PO started (CTRL+C to stop)");
    }
    else {
        log.info("K3PO started");
    }
}

Spork:

} else if (!daemon) {
    log.info("K3PO started (CTRL+C to stop)");
} else {
    log.info("K3PO started");
}

While this one did not produce a compilation error in this particular example, it is a risky transformation.

Compilable errors

These were outputs that still produced valid Java code, but were unexpected.

Adds Parentheses (2995-13)

This one is particularly common in our findings and we wonder if it's related to the pretty printer.

Expected:

this( ( NativeImg< ?, ? extends LongAccess > ) null )

Spork:

this(((NativeImg<?, ? extends LongAccess>) (null)));
     ^                                     ^    ^^

Added Extra Semicolons (1741-4)

int to = (int) docTrees.getSourcePositions().getEndPosition(pkgTree.
getCompilationUnit(), doc, node);;
                                 ^
SearchStrategyModule stratModule = new SearchStrategyModule() {
[...]
};;;
  ^^

Swaps Qualifier Positions (4959-12)

Expected:

private final static Map<String, Integer> RULE_MAP = new HashMap<String, Integer>();
        ^^^^^ ^^^^^^

Spork

private static final Map<String, Integer> RULE_MAP = new HashMap<String, Integer>();
        ^^^^^^ ^^^^^

Thank you for your review!

monperrus commented 2 months ago

Thanks a lot for reporting those corner cases. Would you have test resources and ideally a failing test case that reproduces the bugs?

kjy5 commented 2 months ago

This repo has our testing framework: https://github.com/benedikt-schesch/AST-Merging-Evaluation/tree/main.

After you install the requirements, you can use the IDs I provided in parentheses (like 345-203) to replay that particular merge case and review the output of Spork along with the expected result.

For example, to replay the first example, run the following in the root of the repo

src/python/replay_merge.py --idx 345-203

Some more commands are listed at the bottom of the README.

Let me know if you have any questions!

monperrus commented 2 months ago

Nice testing framework! Understanding and fixing those errors would probably require some real time. Your pull-requests are most welcome :)

slarse commented 1 month ago

Hello @kjy5!

How cool, thanks for reaching out! I've been on vacation and took a complete break from GitHub, hence the rather late response. Not that I'm ever that snappy anymore :).

I'll quickly try to address each of the unexpected results and then go over this more thoroughly when I get the chance (but I really need to look at some open PRs as well, so I can't say when that'll be).

I hope this gives you some answers at least.

Non-compilable errors

Places Parentheses Incorrectly (345-203)

I really have no idea of what this one could be. I will have to investigate further.

Classes With the Same Name (1322-24)

This is probably Spoon's fault, and not unlikely by my hand. I had a large amount of headaches with package names being marked as explicit (i.e. "this qualified package name exists in the source code") when they were really implicit (i.e. "this class comes from this package but it's not written out"). This is the reverse of that, but probably related. I don't think it has anything to do with the merge, it's much more likely it's related to the printing.

Will have to investigate further to confirm.

Omission of Types (1741-4)

Most likely an issue with printing.

Incorrect Placement of Generics (2955-13)

These aren't even the same ASTs. Most likely an issue with building the Spoon AST from the merged PCS set.

Dropping Escape Characters (4595-12)

Almost guaranteed to be a printing issue.

Merge Cascaded Conditionals (1885-445)

This is definitely an issue with implicit elements (like with the package name). See the documentation for Spoon's CtIf class.

Can't really say how the if block gets marked as implicit when it shouldn't be, will have to run the merge and debug it.

Compilable errors

Adds Parentheses (2995-13)

This is the behavior of the default printer in Spoon (which Spork's printer is built on top of). See this issue: https://github.com/INRIA/spoon/issues/3809

Added Extra Semicolons (1741-4)

This is probably an issue with Spork's formatting-preserving printing (which is fancy speak for "copy/paste") doing double work with the default printer. I had issues with source code positions from Eclipse JDT (which is what Spoon is built on) sometimes including semicolons (and other punctuation), and sometimes not. It was one of the primary reasons I did not move forward with arbitrarily granular formatting-preserving printing, and kept it on a pretty coarse level.

Swaps Qualifier Positions (4959-12)

Default pretty printer behavior, pretty hard to do anything about.