kotlinx / ast

Generic AST parsing library for kotlin multiplatform
Apache License 2.0

Best way to explore AST to search for specific patterns? #33

Open ghost opened 3 years ago

ghost commented 3 years ago

Hello,

I am trying to migrate an existing project from kastree to this library.

I am struggling to retrieve the content of a parameter of an annotation that is on a method of a class.

I have a Kotlin file that looks like this (excerpt):

class MyClass {

    @KafkaListener(
        id = "\${'$'}{messaging.command.topic.consumer.group.name}",
        clientIdPrefix = "\${'$'}{messaging.command.topic.consumer.group.name}",
        topics = ["direct.topic.name.2", "\${'$'}{messaging.command.topic.name.2}"],
        concurrency = "\${'$'}{messaging.command.topic.listener-count}"
    )
    fun topicTest4MultipleMixedTopics(@Payload entityCommand: EntityCommand<JsonNode>, record: ConsumerRecord<String, Array<Byte>>) {
    }
}

What's the best way to get the content of the topics argument of the @KafkaListener annotation?

So far I have come up with this, which gives me the members of the class:

kotlinFile.summary(attachRawAst = false)
            .onSuccess { ast ->
                ast
                    .filterIsInstance<KlassDeclaration>() // filter on Classes
                    .flatMap { it.flatten("classBody") } // get the Class body
                    .flatMap { it.children } // get all the declarations within that class (functions etc)
                    .filterIsInstance(KlassDeclaration::class.java) // filter on KlassDeclaration
                    .flatMap { parseTopics2(it) } // parse topics from functions
            }

This tries to parse the function declaration block. Each node has a single node in its children, over and over, and I see no good way to get at the content of the actual string.

private fun parseTopics2(func: KlassDeclaration): List<Pair<String, List<Schema>>> {
        func.children
            .asSequence()
            .filterIsInstance<KlassAnnotation>()
            .filter { it.description.contains(annotation) }
            .mapNotNull { it.arguments.firstOrNull { it.identifier.identifier == "topics" } }
            .mapNotNull { it.expressions.firstOrNull() }
            .filter { it.description == "collectionLiteral" }
            .filterIsInstance<DefaultAstNode>()
            .mapNotNull { it.children.getOrNull(1) }
            .toList()
            .flatMap { it.flatten("stringLiteral") }
        return emptyList()
    }

I am also not sure that calling .filterIsInstance<KlassAnnotation>() is a good way of filtering the AST. Surely there is a better way of doing that, no?

drieks commented 3 years ago

Hi @gauthier-roebroeck-mox, KlassDeclaration has a member called annotations. When you have the func: KlassDeclaration, you can write func.annotations to get a list of all its annotations. You can then compare the identifier of each annotation to check whether its name is KafkaListener. If it is, you can write

anno.arguments.find { argument ->
  argument.identifier.identifierName() == "topics"
}

(import kotlinx.ast.common.klass.identifierName)

Reading the value is not so easy, because currently only parsing of top-level elements is implemented. Once you have the argument, you can use children to read the value. In this case, there is only one child, with description "collectionLiteral", which in turn has six children:

0 = {DefaultAstTerminal@4161} DefaultAstTerminal(description=LSQUARE, text=[, channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   65 [203..204]   [5:18..5:19]}))
1 = {DefaultAstNode@4162} DefaultAstNode(description=expression, children=[DefaultAstNode(description=disjunction, children=[DefaultAstNode(description=conjunction, children=[DefaultAstNode(description=equality, children=[DefaultAstNode(description=comparison, children=[DefaultAstNode(description=genericCallLikeComparison, children=[DefaultAstNode(description=infixOperation, children=[DefaultAstNode(description=elvisExpression, children=[DefaultAstNode(description=infixFunctionCall, children=[DefaultAstNode(description=rangeExpression, children=[DefaultAstNode(description=additiveExpression, children=[DefaultAstNode(description=multiplicativeExpression, children=[DefaultAstNode(description=asExpression, children=[DefaultAstNode(description=prefixUnaryExpression, children=[DefaultAstNode(description=postfixUnaryExpression, children=[DefaultAstNode(description=primaryExpression, children=[DefaultAstNode(description=stringLiteral, children=[DefaultAstNode(description=lineStringLiteral, children=[DefaultAstTerminal
2 = {DefaultAstTerminal@4163} DefaultAstTerminal(description=COMMA, text=,, channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   69 [225..226]   [5:40..5:41]}))
3 = {DefaultAstTerminal@4164} DefaultAstTerminal(description=Inside_WS, text= , channel=AstChannel(id=1, name=HIDDEN), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   70 [226..227]   [5:41..5:42]}))
4 = {DefaultAstNode@4165} DefaultAstNode(description=expression, children=[DefaultAstNode(description=disjunction, children=[DefaultAstNode(description=conjunction, children=[DefaultAstNode(description=equality, children=[DefaultAstNode(description=comparison, children=[DefaultAstNode(description=genericCallLikeComparison, children=[DefaultAstNode(description=infixOperation, children=[DefaultAstNode(description=elvisExpression, children=[DefaultAstNode(description=infixFunctionCall, children=[DefaultAstNode(description=rangeExpression, children=[DefaultAstNode(description=additiveExpression, children=[DefaultAstNode(description=multiplicativeExpression, children=[DefaultAstNode(description=asExpression, children=[DefaultAstNode(description=prefixUnaryExpression, children=[DefaultAstNode(description=postfixUnaryExpression, children=[DefaultAstNode(description=primaryExpression, children=[DefaultAstNode(description=stringLiteral, children=[DefaultAstNode(description=lineStringLiteral, children=[DefaultAstTerminal
5 = {DefaultAstTerminal@4166} DefaultAstTerminal(description=RSQUARE, text=], channel=AstChannel(id=0, name=DEFAULT_TOKEN_CHANNEL), attachments=AstAttachments(attachments={kotlinx.ast.common.ast.AstAttachmentAstInfo@4944de48=   77 [268..269]   [5:83..5:84]}))

You can try calling summary again on all children to get an AST that is easier to use.
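The long expression, disjunction, conjunction, ... chain in the dump is a run of wrapper nodes that each hold exactly one child. As a minimal sketch of how such a chain can be skipped, the block below uses small stand-in types (Ast, Node, Terminal) and a hypothetical helper unwrap; none of these are kotlinx.ast classes or functions, and the sample tree is a shortened, hand-written imitation of one array element.

```kotlin
// Stand-in types for illustration only; these are NOT the kotlinx.ast classes.
sealed interface Ast { val description: String }
data class Node(override val description: String, val children: List<Ast>) : Ast
data class Terminal(override val description: String, val text: String) : Ast

// Follow a chain of single-child wrapper nodes and return the first tree
// that is a terminal or a node with zero or several children.
tailrec fun unwrap(ast: Ast): Ast =
    if (ast is Node && ast.children.size == 1) unwrap(ast.children.single()) else ast

// Hypothetical, shortened imitation of one array element from the dump.
val wrappedElement: Ast =
    Node("expression", listOf(
        Node("primaryExpression", listOf(
            Node("stringLiteral", listOf(
                Node("lineStringLiteral", listOf(
                    Terminal("QUOTE_OPEN", "\""),
                    Node("lineStringContent", listOf(
                        Terminal("LineStrText", "direct.topic.name.2")
                    )),
                    Terminal("QUOTE_CLOSE", "\"")
                ))
            ))
        ))
    ))
```

Calling unwrap(wrappedElement) on this sample stops at the lineStringLiteral node, because it is the first node with more than one child. With the real library the same idea would have to be expressed against its own node types.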

Please let me know if you have any questions.

ghost commented 3 years ago

The best I could come up with is this, not very pretty:

func.annotations
            .mapNotNull { annotation -> annotation.arguments.firstOrNull { it.identifier?.identifier == "topics" } }
            .mapNotNull { it.expressions.firstOrNull() }
            .flatMap { it.flatten("lineStringContent") }
            .flatMap { it.children }
            .filter { it.description == "LineStrText" }
            .filterIsInstance<DefaultAstTerminal>()
            .map { it.text }

Is it possible to use something in kotlinx.ast.common.filter to make it better? I couldn't find any documentation on the TreeFilter classes.
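One way to tidy a chain like the one above is to move the "find by description" and "read leaf text" steps into small helpers. The following is only a sketch on stand-in types (Tree, Branch, Leaf); the helpers findAll, leafText and topicLiterals are hypothetical names and are not part of kotlinx.ast or its filter package.

```kotlin
// Stand-in types for illustration only; these are NOT the kotlinx.ast classes.
sealed interface Tree { val description: String }
data class Branch(override val description: String, val children: List<Tree>) : Tree
data class Leaf(override val description: String, val text: String) : Tree

// Depth-first search for every subtree with the given description.
fun Tree.findAll(description: String): List<Tree> {
    val self = if (this.description == description) listOf(this) else emptyList()
    val below = if (this is Branch) children.flatMap { it.findAll(description) } else emptyList()
    return self + below
}

// Concatenate the text of all leaves with the given description in a subtree.
fun Tree.leafText(description: String): String =
    findAll(description).filterIsInstance<Leaf>().joinToString("") { it.text }

// One string per "lineStringLiteral", built from its "LineStrText" leaves.
fun topicLiterals(argumentValue: Tree): List<String> =
    argumentValue.findAll("lineStringLiteral").map { it.leafText("LineStrText") }

// Hypothetical, hand-written sample shaped like a two-element array literal.
val sampleValue: Tree = Branch("collectionLiteral", listOf(
    Leaf("LSQUARE", "["),
    Branch("lineStringLiteral", listOf(
        Leaf("QUOTE_OPEN", "\""),
        Branch("lineStringContent", listOf(Leaf("LineStrText", "direct.topic.name.2"))),
        Leaf("QUOTE_CLOSE", "\"")
    )),
    Leaf("COMMA", ","),
    Branch("lineStringLiteral", listOf(
        Leaf("QUOTE_OPEN", "\""),
        Branch("lineStringContent", listOf(Leaf("LineStrText", "other.topic"))),
        Leaf("QUOTE_CLOSE", "\"")
    )),
    Leaf("RSQUARE", "]")
))
```

Grouping by literal keeps the text pieces of one string together instead of flattening everything into a single list. Note that this sketch, like the chain above, only reads LineStrText leaves, so any part of a string that is not plain text (for example a template expression) would be dropped.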

drieks commented 3 years ago

Hi @gauthier-roebroeck-mox,

Yes, sadly I have only a little time to work on this library and I prefer to add new functionality, so there is almost no documentation. TreeFilter is mainly used as an internal API by TreeMapper, which is something like map/flatMap on an AST structure.

If you are interested in an example: https://github.com/kotlinx/ast/blob/816803892c3b4558a8036d4220d9326c47b14b01/grammar-kotlin-parser-common/src/commonMain/kotlin/kotlinx/ast/grammar/kotlin/common/summary/KotlinTreeMapBuilder.kt#L1518-L1532

This code converts the annotation from your example, which is parsed into this AST: https://github.com/kotlinx/ast/blob/816803892c3b4558a8036d4220d9326c47b14b01/grammar-kotlin-parser-test/src/commonMain/resources/testdata/Issue33.raw.ast.txt#L23-L257

into this summary: https://github.com/kotlinx/ast/blob/816803892c3b4558a8036d4220d9326c47b14b01/grammar-kotlin-parser-test/src/commonMain/resources/testdata/Issue33.summary.ast.txt#L6-L18

    // this call adds a new definition to the TreeMapper
    .convert(
        // the byDescription filter selects all AST nodes (and also terminal symbols) with the given description
        filter = byDescription("annotation")
    // select only AstNodes (nodes that can have child nodes), not AstTerminals (leaf nodes containing text)
    ) { node: AstNode ->
        // use the same TreeMapper to convert all children of this node
        recursiveFlatten(node).flatMap { result ->
            // continue the tree mapping,
            astContinue(
                // replacing the node `node` with this KlassAnnotation
                KlassAnnotation(
                    // result is the summary of all child nodes;
                    // keep all KlassIdentifiers as the identifier
                    identifier = result.filterIsInstance<KlassIdentifier>(),
                    // ...and all KlassDeclarations as the arguments
                    arguments = result.filterIsInstance<KlassDeclaration>()
                )
            )
        }
    }
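To illustrate the "map/flatMap on an AST structure" idea in isolation, here is a minimal, self-contained sketch of a bottom-up tree mapping with filter-plus-convert rules. It is not the library's TreeMapper: Raw, RawNode, RawLeaf, Rule, mapTree and the two summary classes are all hypothetical names invented for this example.

```kotlin
// Stand-in raw tree; these are NOT the kotlinx.ast classes.
sealed interface Raw { val description: String }
data class RawNode(override val description: String, val children: List<Raw>) : Raw
data class RawLeaf(override val description: String, val text: String) : Raw

// Hypothetical summary types produced by the mapping.
sealed interface Summary
data class NameSummary(val value: String) : Summary
data class AnnotationSummary(val names: List<NameSummary>, val rest: List<Summary>) : Summary

// A rule pairs a filter with a conversion that receives the matched tree
// and the already-converted results of its children.
class Rule(val filter: (Raw) -> Boolean, val convert: (Raw, List<Summary>) -> List<Summary>)

// Convert the children first, then apply the first matching rule.
// Without a match, the children's results are passed through unchanged.
fun mapTree(raw: Raw, rules: List<Rule>): List<Summary> {
    val childResults = if (raw is RawNode) raw.children.flatMap { mapTree(it, rules) } else emptyList()
    val rule = rules.firstOrNull { it.filter(raw) }
    return rule?.convert?.invoke(raw, childResults) ?: childResults
}

val rules = listOf(
    // Turn identifier leaves into NameSummary values.
    Rule({ it is RawLeaf && it.description == "Identifier" }) { raw, _ ->
        listOf(NameSummary((raw as RawLeaf).text))
    },
    // Replace an "annotation" node with one AnnotationSummary built from its children's results.
    Rule({ it.description == "annotation" }) { _, kids ->
        listOf(AnnotationSummary(kids.filterIsInstance<NameSummary>(), kids.filterNot { it is NameSummary }))
    }
)

// Hypothetical, hand-written raw tree for "@KafkaListener".
val rawAnnotation: Raw = RawNode("annotation", listOf(
    RawLeaf("AT", "@"),
    RawNode("simpleIdentifier", listOf(RawLeaf("Identifier", "KafkaListener")))
))
```

With these rules, mapTree(rawAnnotation, rules) drops the "@" leaf, lifts the identifier through the unmatched simpleIdentifier wrapper, and yields a single AnnotationSummary, which mirrors the shape of the convert call above under the stated assumptions.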

I hope this helps; feel free to ask me any questions. Sadly, I have no time to add functionality that would make working with the AST nodes easier.

In my private project (for which I'm developing this library) I'm doing code generation, using https://github.com/square/kotlinpoet for writing the generated code and kotlinx.ast for parsing the source. The first step after kotlinx.ast is filtering the annotations I'm interested in (and the annotated subjects); I store this information in a set of data classes. This is very similar to the code you provided here. In the last step, I convert these data classes into generated code using kotlinpoet.
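The "store the filtered information in data classes, then generate code" step might look roughly like the sketch below. ListenerInfo and render are hypothetical names, and plain string building stands in for kotlinpoet here so that the block has no dependencies; it is not how the project described above is actually implemented.

```kotlin
// Hypothetical record of one annotated function; the field names are illustrative.
data class ListenerInfo(
    val className: String,
    val functionName: String,
    val topics: List<String>
)

// Stand-in for the generation step: renders a Kotlin property that lists the topics.
// No escaping is done, so topic strings containing quotes or '$' would need extra care.
fun render(info: ListenerInfo): String {
    val literals = info.topics.joinToString(", ") { "\"$it\"" }
    return "val ${info.functionName}Topics = listOf($literals)"
}

// Example record, filled in by hand rather than by a parser.
val exampleInfo = ListenerInfo(
    className = "MyClass",
    functionName = "topicTest4MultipleMixedTopics",
    topics = listOf("direct.topic.name.2")
)
```

Keeping the extracted data in a plain data class separates the AST traversal from the generation step, so either side can change without touching the other.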