thdiaman / ASTExtractor

ASTExtractor: Abstract Syntax Tree Extractor for Java Source Code

ASTExtractor is an Abstract Syntax Tree (AST) extractor for Java source code, based on the Eclipse compiler. The tool wraps the Eclipse compiler and exports the AST of source code files or projects in XML or JSON format. It has a command line interface and can also be used as a library. The documentation is available at http://thdiaman.github.io/ASTExtractor/

Executing in Command Line mode

Execute as:

java -jar ASTExtractor.jar -project="path/to/project" -properties="path/to/propertiesfile" -repr=XML|JSON
for projects, or as:
java -jar ASTExtractor.jar -file="path/to/file" -properties="path/to/propertiesfile" -repr=XML|JSON
for Java files. The -properties option sets the location of the properties file (by default no properties file is used, so all syntax tree nodes are returned), and -repr selects the representation of the tree (XML by default).
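The two invocations above can also be assembled from a script. The following is a minimal sketch in Python, using only the flags listed above. The helper names `build_command` and `run_extractor` are hypothetical, the jar is assumed to sit in the working directory, and it is an assumption that the tool writes the tree to standard output.

```python
import subprocess
from typing import List, Optional


def build_command(target: str, is_project: bool,
                  properties: Optional[str] = None,
                  representation: str = "XML",
                  jar: str = "ASTExtractor.jar") -> List[str]:
    """Assemble the argument list for one ASTExtractor run (hypothetical helper)."""
    if representation not in ("XML", "JSON"):
        raise ValueError("representation must be 'XML' or 'JSON'")
    # -project analyzes a whole project, -file a single Java file
    flag = "-project" if is_project else "-file"
    # No shell is involved, so the path needs no surrounding quotes
    command = ["java", "-jar", jar, f"{flag}={target}"]
    if properties is not None:
        # Leaving out -properties keeps every syntax tree node
        command.append(f"-properties={properties}")
    command.append(f"-repr={representation}")
    return command


def run_extractor(command: List[str]) -> str:
    """Run the command and return its standard output (assumed to hold the tree)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `build_command("path/to/file", is_project=False, representation="JSON")` yields the argument list for the second form with JSON output and no properties file.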

Using as a library

Import the library in your code. Set a location for the properties file using:

ASTExtractorProperties.setProperties("ASTExtractor.properties");

Then, you can use it as follows:

Using in Python

ASTExtractor also has Python bindings, and using the Python wrapper is simple. First, import the library and initialize the ASTExtractor object with the path to the jar of the library and the path to its properties file:

ast_extractor = ASTExtractor("path/to/ASTExtractor.jar", "path/to/ASTExtractor.properties")

After that, you can use it as follows:

Note that after using the library, you have to close the ASTExtractor object by calling its close function, i.e.:

ast_extractor.close()
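One way to make sure close is always called, even when an error occurs midway, is the standard `contextlib.closing` helper. This is a minimal sketch: `StubExtractor` is a hypothetical stand-in defined here only so the snippet runs without the jar; with the real wrapper, the initialized ASTExtractor object would take its place.

```python
from contextlib import closing


class StubExtractor:
    """Hypothetical stand-in for the real ASTExtractor object (not the library's class)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


extractor = StubExtractor()
try:
    # closing() calls extractor.close() when the block exits, error or not
    with closing(extractor):
        raise RuntimeError("simulated failure while extracting")
except RuntimeError:
    pass
```

After this runs, `extractor.closed` is true although the block raised an exception.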

Controlling the output

An Abstract Syntax Tree can be very complex, including details for every identifier of the code. In ASTExtractor, the complexity of the tree can be controlled using the ASTExtractor.properties file. In this file, the user can select the nodes that should not appear in the final tree (OMIT) and the nodes that should not be analyzed further, i.e., that should be forced to be leaf nodes (LEAF). The default options are shown in the following example ASTExtractor.properties file:

LEAF = PackageDeclaration, ImportDeclaration, ParameterizedType, ArrayType, VariableDeclarationFragment
OMIT = Javadoc, Block
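If the properties file needs to be generated or adjusted from a script, the format above is simple enough to read and write by hand. The sketch below assumes one `KEY = A, B, C` entry per line and treats lines starting with `#` as comments (the usual Java properties convention, assumed here rather than confirmed for this tool). The function names are hypothetical and not part of ASTExtractor.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, List[str]]:
    """Parse 'KEY = A, B, C' lines into a dict mapping each key to its node types."""
    options: Dict[str, List[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and lines without a key
        key, _, value = line.partition("=")
        options[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return options


def format_properties(options: Dict[str, List[str]]) -> str:
    """Serialize the options back to the 'KEY = A, B, C' layout."""
    return "\n".join(f"{key} = {', '.join(nodes)}" for key, nodes in options.items())


DEFAULTS = """\
LEAF = PackageDeclaration, ImportDeclaration, ParameterizedType, ArrayType, VariableDeclarationFragment
OMIT = Javadoc, Block
"""

options = parse_properties(DEFAULTS)
# Example adjustment: stop omitting Block nodes so they appear in the tree
options["OMIT"].remove("Block")
custom_properties = format_properties(options)
```

Writing `custom_properties` to a file and passing its path through -properties (or to the library) would then apply the adjusted settings.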