AndroidIDEOfficial / android-tree-sitter

Tree Sitter for Android
GNU Lesser General Public License v2.1

android-tree-sitter



Android Java bindings for tree-sitter. Currently used in AndroidIDE and sora-editor.

Add to your project

Latest maven release

// main library
implementation 'com.itsaky.androidide.treesitter:android-tree-sitter:<version>'

// grammar libraries
// <language> is the name of the language grammar. e.g. 'java', 'python', 'xml', etc.
implementation 'com.itsaky.androidide.treesitter:tree-sitter-<language>:<version>'

The following grammars have been published to Maven central :

Building

Prerequisites

As tree-sitter is already included in the source (as a submodule), the tree-sitter-cli is built from source and then used to build the grammars.

Read the documentation on how to install pre-built versions of tree-sitter.

If you want to use a prebuilt tree-sitter binary (installed manually, or with npm or cargo), make sure it is accessible in the PATH, then set the TS_CLI_BUILD_FROM_SOURCE environment variable to false :

export TS_CLI_BUILD_FROM_SOURCE=false

IMPORTANT: Building on a Linux machine is recommended.

Get source

Clone this repo with :

git clone --recurse-submodules https://github.com/AndroidIDEOfficial/android-tree-sitter

Install grammar dependencies

You might need to install the Node packages required by the grammars. For example, the cpp grammar requires you to install the packages:

# from the root directory of this project
cd grammars/cpp && npm install && cd -

Build

A normal Gradle build (./gradlew build) builds everything for both Android and the host OS. To build android-tree-sitter and the grammars only for the host OS, execute the buildForHost task on the appropriate subprojects.

Adding grammars

The Gradle modules for the grammars are almost identical, with only minor differences in the CMakeLists file and the Java binding class.

These Gradle modules are automatically generated by the DynamicModulePlugin. The generation process relies on the grammars.json file. More information about the structure of this JSON file can be found in the README under the grammars directory.

Apart from the DynamicModulePlugin, there are other Gradle plugins which are used to configure and build the grammars effectively.

The common configuration for the grammars can be found in the build.gradle.kts file. This is where you can make changes or adjustments to the module configuration that applies to all grammars.

The generated modules are located in the rootDir/grammar-modules directory. This is where you can find the output of the module generation process.

To add a new grammar to the project, follow these steps:

  1. Begin by adding the grammar source code to the grammars directory. To accomplish this, you can add a submodule using the following command:

    git submodule add <remote_url> grammars/<language_name>
  2. The language_name should be the simple name of the language, without the tree-sitter- prefix. This name is used to generate both the shared library and the Gradle module. For example, if the language_name is abc:

    • The module tree-sitter-abc will be automatically generated.
    • The name of the resulting shared library will be libtree-sitter-abc.so.
  3. After adding the grammar source, update the grammars.json file to include the newly added grammar in the project.

  4. Finally, sync the project to trigger the generation of the module for the newly added grammar.
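
The naming rule from step 2 can be sketched in a few lines of Java. This only illustrates the convention described above and is not code from the project: GrammarNames and its methods are hypothetical names, and the real module and library names are produced by the DynamicModulePlugin.

```java
// Hypothetical helper, not part of android-tree-sitter.
// It mirrors the naming convention described in step 2.
final class GrammarNames {

  private static final String PREFIX = "tree-sitter-";

  private GrammarNames() {}

  /** Gradle module name for a grammar, e.g. "abc" becomes "tree-sitter-abc". */
  static String moduleName(String languageName) {
    return PREFIX + requireSimpleName(languageName);
  }

  /** Shared library file name, e.g. "abc" becomes "libtree-sitter-abc.so". */
  static String sharedLibraryName(String languageName) {
    return "lib" + moduleName(languageName) + ".so";
  }

  // The language name must be the simple name, without the tree-sitter- prefix.
  private static String requireSimpleName(String languageName) {
    if (languageName == null || languageName.isEmpty()) {
      throw new IllegalArgumentException("language name must not be empty");
    }
    if (languageName.startsWith(PREFIX)) {
      throw new IllegalArgumentException("use the simple name, without the '" + PREFIX + "' prefix");
    }
    return languageName;
  }
}
```

With this sketch, GrammarNames.sharedLibraryName("abc") yields libtree-sitter-abc.so, which matches the example in step 2.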

Loading external grammars

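You have two ways to load grammars that are not published along with this project :

  1. Package the grammar with your application.
  2. Load the grammar at runtime using TSLanguage.loadLanguage.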

TSLanguage.loadLanguage uses dlopen to load the library and must be used CAREFULLY. Grammars loaded using this method must also be closed when they are no longer needed.

Prefer using the first method whenever possible.

Package the grammar with your application

You can package the grammar in your Android application as you would package any other shared library :

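// MyClass.java
package com.my.app;

public class MyClass {

  static {
    // loads libtree-sitter-myLang.so packaged with the application
    System.loadLibrary("tree-sitter-myLang");
  }

  // returns the pointer to the native language instance
  public static native long myLang();
}

// C++ (JNI) implementation of MyClass.myLang()
// TSLanguage is declared by the tree-sitter headers
#include <jni.h>

extern "C" TSLanguage *tree_sitter_myLang();

extern "C"
JNIEXPORT jlong JNICALL
Java_com_my_app_MyClass_myLang(JNIEnv *env, jclass clazz) {
  // simply cast the language pointer to jlong
  return (jlong) tree_sitter_myLang();
}

// create the TSLanguage from the native pointer
final TSLanguage myLang = TSLanguage.create("myLang", MyClass.myLang());

// use it with TSParser
try (final var parser = TSParser.create()) {
  parser.setLanguage(myLang);
  ...
}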

Load grammars at runtime

android-tree-sitter v3.1.0 or newer is required for this method.

TSLanguage provides the loadLanguage(String, String) method, which can be used to load grammars at runtime. This method uses dlopen to load the shared library, gets the language instance and returns its pointer. Use this method CAREFULLY.

The language instances created using this method MUST be closed using TSLanguage.close(). Calling the close method ensures that the underlying dlopen'ed library handle is closed using dlclose.

Usage :

// provide the path to the shared library and the name of the language
// the name is used to cache the language instance
// further invocations of this method with the same language name return the
// cached language instance
final TSLanguage myLang = TSLanguage.loadLanguage("/path/to/libtree-sitter-myLang.so", "myLang");

if (myLang != null) {
  // loaded successfully
} else {
  // failed to load the language
  // see logcat for details
}
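
Because loadLanguage takes both a library path and a language name, it can help to derive the name from the file name and to check the file before attempting to load it. The following is a minimal sketch under the assumption that the library follows the libtree-sitter-<language_name>.so naming used by this project. GrammarLibraries and its methods are hypothetical helpers, not part of android-tree-sitter.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of android-tree-sitter.
final class GrammarLibraries {

  private static final String PREFIX = "libtree-sitter-";
  private static final String SUFFIX = ".so";

  private GrammarLibraries() {}

  /**
   * Extracts the language name from a library path, assuming the
   * libtree-sitter-<language_name>.so convention. Returns an empty
   * Optional if the file name does not follow that convention.
   */
  static Optional<String> languageNameOf(Path library) {
    Path fileNamePath = library.getFileName();
    if (fileNamePath == null) {
      return Optional.empty();
    }
    String fileName = fileNamePath.toString();
    if (!fileName.startsWith(PREFIX) || !fileName.endsWith(SUFFIX)) {
      return Optional.empty();
    }
    String name = fileName.substring(PREFIX.length(), fileName.length() - SUFFIX.length());
    return name.isEmpty() ? Optional.empty() : Optional.of(name);
  }

  /** Cheap pre-check: the library must be an existing, readable regular file. */
  static boolean isLoadable(Path library) {
    return Files.isRegularFile(library) && Files.isReadable(library);
  }
}
```

A caller could run both checks first and only then pass the path and the derived name to TSLanguage.loadLanguage. The null check on the returned language is still needed, since loading can fail for other reasons.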

Use this language :

try (final var parser = TSParser.create()) {
  parser.setLanguage(myLang);
  ...
}

You don't have to keep a reference to myLang. Once loaded, the language can be accessed using TSLanguageCache :

// returns the 'myLang' instance i.e. both are same
final TSLanguage cachedMyLang = TSLanguageCache.get("myLang");

DO NOT FORGET to close the language :

// this closes the underlying library handle
myLang.close();
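
One way to make the close call hard to forget is to scope the language to a single block of work. The sketch below is an assumption-laden illustration, not project code: ScopedResource and use are hypothetical names, and the helper is written generically because this README does not state whether TSLanguage can be used in a try-with-resources statement.

```java
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper, not part of android-tree-sitter.
final class ScopedResource {

  private ScopedResource() {}

  /**
   * Loads a resource, runs the body with it and always runs the closer
   * afterwards, even if the body throws. If the loader returns null
   * (loading failed), the body and the closer are both skipped.
   */
  static <T, R> Optional<R> use(Supplier<T> loader, Consumer<T> closer, Function<T, R> body) {
    T resource = loader.get();
    if (resource == null) {
      return Optional.empty(); // nothing was loaded, so there is nothing to close
    }
    try {
      return Optional.ofNullable(body.apply(resource));
    } finally {
      closer.accept(resource);
    }
  }
}

// Intended usage with the calls shown in this README:
//
//   ScopedResource.use(
//       () -> TSLanguage.loadLanguage("/path/to/libtree-sitter-myLang.so", "myLang"),
//       TSLanguage::close,
//       lang -> { /* parse with lang, return a result */ return null; });
```

This pattern only fits short-lived use. Since loaded languages are cached by name, closing a language that other code still uses through TSLanguageCache would leave that code with a closed instance.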

Examples

For examples, see the tests.

License

android-tree-sitter library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

android-tree-sitter library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with android-tree-sitter.  If not, see <https://www.gnu.org/licenses/>.