tracehubpm / jivan

Experimental neural static analyzer for Java programs, based on ML and LLM
MIT License

Parse Java to AST #7

Open h1alexbel opened 5 months ago

h1alexbel commented 5 months ago

Let's parse .java source code into an AST.

h1alexbel commented 5 months ago

Output of the JavaParser XML AST printer:

<?xml version="1.0" ?>
<root type="CompilationUnit">
  <types>
    <type type="ClassOrInterfaceDeclaration" isInterface="false">
      <name type="SimpleName" identifier="HelloWorld"></name>
      <members>
        <member type="MethodDeclaration">
          <body type="BlockStmt">
            <statements>
              <statement type="ExpressionStmt">
                <expression type="MethodCallExpr">
                  <name type="SimpleName" identifier="println"></name>
                  <scope type="FieldAccessExpr">
                    <name type="SimpleName" identifier="out"></name>
                    <scope type="NameExpr">
                      <name type="SimpleName" identifier="System"></name>
                    </scope>
                  </scope>
                  <arguments>
                    <argument type="StringLiteralExpr" value="Hello, World!"></argument>
                  </arguments>
                </expression>
              </statement>
            </statements>
          </body>
          <type type="VoidType"></type>
          <name type="SimpleName" identifier="main"></name>
          <modifiers>
            <modifier type="Modifier" keyword="PUBLIC"></modifier>
            <modifier type="Modifier" keyword="STATIC"></modifier>
          </modifiers>
          <parameters>
            <parameter type="Parameter" isVarArgs="false">
              <name type="SimpleName" identifier="args"></name>
              <type type="ArrayType" origin="TYPE">
                <componentType type="ClassOrInterfaceType">
                  <name type="SimpleName" identifier="String"></name>
                </componentType>
              </type>
            </parameter>
          </parameters>
        </member>
      </members>
      <modifiers>
        <modifier type="Modifier" keyword="PUBLIC"></modifier>
        <modifier type="Modifier" keyword="FINAL"></modifier>
      </modifiers>
    </type>
  </types>
</root>

Output of a custom XML visitor based on Xembly:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<class id="HelloWorld">
  <modifiers>
    <modifier>public</modifier>
    <modifier>final</modifier>
  </modifiers>
  <methods>
    <method>
      main
      <modifiers>
        <modifier>public</modifier>
        <modifier>static</modifier>
        <calls>
          <call>
            println
            <arguments>
              <arguments>"Hello, World!"</arguments>
            </arguments>
          </call>
        </calls>
      </modifiers>
    </method>
  </methods>
</class>
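To make the shape of that document easier to reason about, here is a minimal sketch that assembles a similar `class` / `modifiers` / `methods` structure. It uses the JDK's built-in DOM and Transformer APIs instead of Xembly, so it runs without extra dependencies. The `ClassXmlSketch` class and its `render` method are hypothetical names, and storing the method name in a `name` attribute is an assumption of this sketch, not what the visitor above emits.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: not part of jivan, JavaParser, or Xembly.
final class ClassXmlSketch {
    /** Renders a class name, its modifiers, and its method names as an XML string. */
    static String render(String name, List<String> modifiers, List<String> methods) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element clazz = doc.createElement("class");
            clazz.setAttribute("id", name);
            doc.appendChild(clazz);

            // One <modifier> element per class modifier.
            Element mods = doc.createElement("modifiers");
            clazz.appendChild(mods);
            for (String modifier : modifiers) {
                Element mod = doc.createElement("modifier");
                mod.setTextContent(modifier);
                mods.appendChild(mod);
            }

            // One <method> element per method; the name goes into an attribute
            // (an assumption of this sketch) to avoid mixed text/element content.
            Element meths = doc.createElement("methods");
            clazz.appendChild(meths);
            for (String method : methods) {
                Element element = doc.createElement("method");
                element.setAttribute("name", method);
                meths.appendChild(element);
            }

            // Serialize the DOM tree to a string.
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Failed to render XML", e);
        }
    }
}
```

Calling `ClassXmlSketch.render("HelloWorld", List.of("public", "final"), List.of("main"))` yields a single-line document rooted at `<class id="HelloWorld">`. Indentation is left off here because pretty-printing details vary between JDK versions.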

h1alexbel commented 5 months ago

Obviously, we also need to add types, Javadocs, and comments.

h1alexbel commented 5 months ago

Let's parse this one:

public class Example {
    public void method() {
        int x = 10;
    }
}

into an AST like this (with the help of XML):

CompilationUnit
├── ClassDeclaration
│   ├── Identifier: Example
│   └── MethodDeclaration
│       ├── Identifier: method
│       └── Block
│           └── VariableDeclaration
│               ├── Type: int
│               └── Identifier: x
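As a rough illustration of how such a tree can be obtained, the sketch below parses a source string and prints one node kind per line, indented by depth. It deliberately uses the JDK's own compiler Tree API (`com.sun.source`) rather than JavaParser, so it needs a JDK but no third-party library. The node names it prints (`COMPILATION_UNIT`, `CLASS`, `METHOD`, and so on) are the JDK's `Tree.Kind` values, not the labels used in the tree above. `AstDump` and `dump` are hypothetical names.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.net.URI;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical helper: a sketch built on the JDK compiler API, not on JavaParser.
final class AstDump {
    /** Parses the source and returns its node kinds, one per line, indented by depth. */
    static String dump(String fileName, String source) {
        // Returns null when running on a runtime without the jdk.compiler module.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // In-memory source file, so nothing has to be written to disk.
        JavaFileObject file = new SimpleJavaFileObject(
            URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        StringBuilder out = new StringBuilder();
        try {
            JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(file));
            // parse() only builds the syntax trees; no type checking is performed.
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Integer>() {
                    @Override
                    public Void scan(Tree node, Integer depth) {
                        if (node == null) {
                            return null;
                        }
                        out.append("  ".repeat(depth))
                            .append(node.getKind()).append('\n');
                        // Children are visited one level deeper.
                        return super.scan(node, depth + 1);
                    }
                }.scan(unit, 0);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Failed to parse " + fileName, e);
        }
        return out.toString();
    }
}
```

For the `Example` class above, `AstDump.dump("Example.java", source)` produces a listing that starts with `COMPILATION_UNIT` and includes `CLASS`, `METHOD`, `VARIABLE`, and `INT_LITERAL` entries. Unlike the hand-drawn tree, it also contains the initializer `10`.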

Once this is done, we will try to encode the AST into vectors and check what the model says about it.
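The encoding itself is not decided yet. As one very simple possibility, the sketch below turns a list of AST node types into a fixed-length count vector (a bag of node types). The vocabulary, the `AstVector` class, and the `encode` method are all hypothetical, and this encoding discards the tree structure entirely, so it is only a baseline to start experimenting with.

```java
import java.util.List;

// Hypothetical helper: one possible baseline encoding, not a settled design.
final class AstVector {
    /** Assumed fixed vocabulary, taken from the node labels in the tree above. */
    static final List<String> VOCABULARY = List.of(
        "CompilationUnit", "ClassDeclaration", "MethodDeclaration",
        "Block", "VariableDeclaration", "Type", "Identifier");

    /** Counts how often each vocabulary entry occurs among the given node types. */
    static int[] encode(List<String> nodeTypes) {
        int[] vector = new int[VOCABULARY.size()];
        for (String type : nodeTypes) {
            int index = VOCABULARY.indexOf(type);
            // Node types outside the vocabulary are ignored.
            if (index >= 0) {
                vector[index]++;
            }
        }
        return vector;
    }
}
```

Feeding it the nine nodes of the `Example` tree in pre-order (`CompilationUnit`, `ClassDeclaration`, `Identifier`, `MethodDeclaration`, `Identifier`, `Block`, `VariableDeclaration`, `Type`, `Identifier`) gives `[1, 1, 1, 1, 1, 1, 3]`.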