GumTreeDiff / gumtree

An awesome code differencing tool
https://github.com/GumTreeDiff/gumtree/wiki
GNU Lesser General Public License v3.0
933 stars 174 forks source link

feat: Add comments to JdtVisitor #368

Closed pouryafard75 closed 2 months ago

pouryafard75 commented 2 months ago

Hello, This is a fix for a very old issue https://github.com/GumTreeDiff/gumtree/issues/39.

The implementation might not be the best one, and I am very eager to improve it based on your feedbacks.

You can enable/disable the feature throughout the JdtTreeGenerator. By default its disabled since you might have a test environment tailored to the previous structure.

    public JdtTreeGenerator() { 
        this(false);
    }

    public JdtTreeGenerator(boolean includeComments) {
        super();
        this.includeComments = includeComments;
    }

For the following code example:

class bar {
        void foo(/*int a*/)
        {
                //run();
        }
}

The generated tree will be as follows:

CompilationUnit [0,87]
                        TypeDeclaration [0,86]
                            TYPE_DECLARATION_KIND: class [0,5]
                            SimpleName: bar [6,9]
                            MethodDeclaration [20,84]
                                PrimitiveType: void [20,24]
                                SimpleName: foo [25,28]
                                BlockComment: /*int a*/ [29,38]
                                Block [48,84]
                                    LineComment: //run(); [66,74]
tsantalis commented 2 months ago

@pouryafard75

Thank you! I have a question about findMostInnerEnclosingParent()

In case the comment is within a statement as follows, does findMostInnerEnclosingParent() return the return statement as parent, or the expressions within the statement, or the parent method declaration body?

return outerInstance.isPresent() //
    ? executableInvoker.invoke(constructor, outerInstance.get(), extensionContext, registry) //
    : executableInvoker.invoke(constructor, extensionContext, registry);

https://github.com/junit-team/junit5/commit/180df5a92#diff-3679a3039f048d756df1feebd214cc17df7b0a686dc3be77f24bff5bc61bbf4cR290-R291

https://github.com/junit-team/junit5/commit/180df5a92#diff-3679a3039f048d756df1feebd214cc17df7b0a686dc3be77f24bff5bc61bbf4cR333-R334

pouryafard75 commented 2 months ago

@tsantalis Very good question!

I tried the following code snippet:

class bar {
      void foo()
            {
            return outerInstance.isPresent() //alpha
                ? executableInvoker.invoke(constructor, outerInstance.get(), extensionContext, registry) //beta
                : executableInvoker.invoke(constructor /* block */ , extensionContext, registry); //gomma
            }
          }

And this is the generated tree:

    TypeDeclaration [0,311]
        TYPE_DECLARATION_KIND: class [0,5]
        SimpleName: bar [6,9]
        MethodDeclaration [20,309]
            PrimitiveType: void [20,24]
            SimpleName: foo [25,28]
            Block [39,309]
                ReturnStatement [41,291]
                    ConditionalExpression [48,290]
                        MethodInvocation [48,73]
                            METHOD_INVOCATION_RECEIVER [48,61]
                                SimpleName: outerInstance [48,61]
                            SimpleName: isPresent [62,71]
                        LineComment: //alpha [74,81]
                        MethodInvocation [100,186]
                            METHOD_INVOCATION_RECEIVER [100,117]
                                SimpleName: executableInvoker [100,117]
                            SimpleName: invoke [118,124]
                            METHOD_INVOCATION_ARGUMENTS [125,185]
                                SimpleName: constructor [125,136]
                                MethodInvocation [138,157]
                                    METHOD_INVOCATION_RECEIVER [138,151]
                                        SimpleName: outerInstance [138,151]
                                    SimpleName: get [152,155]
                                SimpleName: extensionContext [159,175]
                                SimpleName: registry [177,185]
                        LineComment: //beta [187,193]
                        MethodInvocation [212,290]
                            METHOD_INVOCATION_RECEIVER [212,229]
                                SimpleName: executableInvoker [212,229]
                            SimpleName: invoke [230,236]
                            METHOD_INVOCATION_ARGUMENTS [237,289]
                                SimpleName: constructor [237,248]
                                BlockComment: /* block */ [249,260]
                                SimpleName: extensionContext [263,279]
                                SimpleName: registry [281,289]
                LineComment: //gomma [292,299]
tsantalis commented 2 months ago

OK, I think it makes sense to have it like that from the AST point of view. RefactoringMiner keeps all comments in the parent method declaration, but this is more accurate.

tsantalis commented 2 months ago

@jrfaller I reviewed the PR and it is ready to be merged. It would be a nice feature for version 4 to support inline comments.

jrfaller commented 2 months ago

Hi! Thanks a lot for the PR! I realize that activating the comments is done via a boolean on the constructor of JdtGenerator and I am afraid this way it won't be possible to use it from the command line. To overcome this may be the includeComments boolean should be set via a system property like it's done for instance in https://github.com/GumTreeDiff/gumtree/blob/main/core/src/main/java/com/github/gumtreediff/matchers/heuristic/gt/GreedyBottomUpMatcher.java WDYT ? Another solution would be not to have an option to activate / deactivate comments diffing and having them diffed by default? Cheers!

tsantalis commented 2 months ago

@jrfaller @pouryafard75 I agree comments should be diffed by default. We don't need this boolean in the constructor. Comment diff is a very important missing feature.

pouryafard75 commented 2 months ago

@jrfaller I made it work with gumtree.jdt.includeComments system property with the default value of true. Is this fine?

pouryafard75 commented 2 months ago

@jrfaller In case you have missed the previous msg. The PR is ready.

pouryafard75 commented 2 months ago

I have updated this one, to process non-attached javadocs too. getCommentList on CompliationUnit, gives the list of all objects from JavaDoc, LineComment and BlockComment types. In case the JavaDoc doesnt have any parent, it means that its not attached to any program element. I added a test to address the issue.

tsantalis commented 2 months ago

@pouryafard75 None of the comments given by CompilationUnit.getCommentList has a parent. All of them are detached from the AST.

Some of them could be actual javadocs of methods, classes, fields, etc.

It is very rare to have Javadocs within the body of a method as shown below https://github.com/hibernate/hibernate-orm/commit/2176af114#diff-29cf6a32b665d1c14bb90e991cd9f9778cba6e9deee05baa92756bffb8f6ca28R1962-R1966

I think only these cases should be processed from the comment list, because all other normal Javadoc cases already exist as properties in the method, field, type declaration AST nodes.

pouryafard75 commented 2 months ago

@tsantalis Attached JavaDocs actually have parents. Thats the way to distinguish them:

image
tsantalis commented 2 months ago

@pouryafard75 Yes you are right. The attached javadocs to method, field, type declarations have a parent, while those inside method bodies have a null parent.

Your PR is extensively tested and ready to be merged. @jrfaller Please go ahead with the merge. We checked everything in detail.

jrfaller commented 2 months ago

Hi @tsantalis, @pouryafard75!

Thanks a lot for improving the PR. Sorry to be a little over-careful with this one but it's on the JDT parser so it can have a lot of effect :)

I have tried the option mechanism and I think it's not very convenient to use. By reading the code base again I think that finally the easiest way to have both ways to generate the tree would be to have a new TreeGenerator, similar to what I did for the change distiller tree generator here: https://github.com/GumTreeDiff/gumtree/blob/main/gen.jdt/src/main/java/com/github/gumtreediff/gen/jdt/cd/CdJdtTreeGenerator.java

This way it can easily be set up using the -g option in the command line. Sorry not to have spotted this sooner.

@pouryafard75 could you try it?

Cheers!

pouryafard75 commented 2 months ago

@jrfaller What do you think about this new impl?

pouryafard75 commented 2 months ago

I have written separate Visitor and Generator for it. I think now its ready and functional.

jrfaller commented 2 months ago

Perfect! Looks very nice! Thanks @pouryafard75 and @tsantalis !!!

monperrus commented 2 months ago

@pouryafard75 @tsantalis ICYMI, see also https://github.com/SpoonLabs/gumtree-spoon-ast-diff/ Gumtree-diff for Java with the full power of the Spoon API.

tsantalis commented 2 months ago

@monperrus Martin, thanks for the pointer. We will check if it is possible to add gumtree-spoon in our AST diff benchmark.

tsantalis commented 2 months ago

@jrfaller When should we expect the next Gumtree release?

jrfaller commented 1 month ago

Hi @tsantalis ! I have just released 4.0.0-beta3 if I don't run into any bug with it should be soon released as 4.0.0!