Colonial-Dev / inkjet

A batteries-included syntax highlighting library for Rust, based on tree-sitter.
https://docs.rs/inkjet
Apache License 2.0

Inkjet

A batteries-included syntax highlighting library for Rust, based on tree-sitter.

Features

Included Languages

Inkjet comes bundled with support for over seventy languages, and it's easy to add more - see the FAQ section.

| Name | Recognized Tokens |
| ---- | ------- |
| Ada | `ada` |
| Assembly (generic) | `asm` |
| Astro | `astro` |
| Awk | `awk` |
| Bash | `bash`, `sh`, `shell` |
| BibTeX | `bibtex`, `bib` |
| Bicep | `bicep` |
| Blueprint | `blueprint`, `blp` |
| C | `c`, `h` |
| Cap'N Proto | `capnp` |
| Clojure | `clojure`, `clj`, `cljc` |
| C# | `c_sharp`, `c#`, `csharp`, `cs` |
| Common Lisp | `commonlisp`, `common-lisp`, `cl`, `lisp` |
| C++ | `c++`, `cpp`, `hpp`, `h++`, `cc`, `hh` |
| CSS | `css` |
| Cue | `cue` |
| D | `d`, `dlang` |
| Dart | `dart` |
| Diff | `diff` |
| Dockerfile | `dockerfile`, `docker` |
| EEx | `eex` |
| Emacs Lisp | `elisp`, `emacs-lisp`, `el` |
| Elixir | `ex`, `exs`, `leex` |
| Elm | `elm` |
| Erlang | `erl`, `hrl`, `es`, `escript` |
| Forth | `forth`, `fth` |
| Fortran | `fortran`, `for` |
| GDScript | `gdscript`, `gd` |
| Gleam | `gleam` |
| GLSL | `glsl` |
| Go | `go`, `golang` |
| Haskell | `haskell`, `hs` |
| HCL | `hcl`, `terraform` |
| HEEx | `heex` |
| HTML | `html`, `htm` |
| IEx | `iex` |
| INI | `ini` |
| JavaScript | `javascript`, `js` |
| JSON | `json` |
| JSX | `jsx` |
| Kotlin | `kotlin`, `kt`, `kts` |
| LaTeX | `latex`, `tex` |
| LLVM | `llvm` |
| Lua | `lua` |
| GNU Make | `make`, `makefile`, `mk` |
| MatLab | `matlab`, `m` |
| Meson | `meson` |
| Nim | `nim` |
| Nix | `nix` |
| Objective C | `objective_c`, `objc` |
| OCaml | `ocaml`, `ml` |
| OCaml Interface | `ocaml_interface`, `mli` |
| OpenSCAD | `openscad`, `scad` |
| Pascal | `pascal` |
| PHP | `php` |
| ProtoBuf | `protobuf`, `proto` |
| Python | `python`, `py` |
| R | `r` |
| Racket | `racket`, `rkt` |
| Regex | `regex` |
| Ruby | `ruby`, `rb` |
| Rust | `rust`, `rs` |
| Scala | `scala` |
| Scheme | `scheme`, `scm`, `ss` |
| SCSS | `scss` |
| SQL (Generic) | `sql` |
| Swift | `swift` |
| TOML | `toml` |
| TypeScript | `typescript`, `ts` |
| TSX | `tsx` |
| Vimscript | `vimscript`, `vim` |
| WAST (WebAssembly Script) | `wast` |
| WAT (WebAssembly Text) | `wat`, `wasm` |
| x86 Assembly | `x86asm`, `x86` |
| WGSL | `wgsl` |
| YAML | `yaml` |
| Zig | `zig` |

In addition to these languages, Inkjet also offers the Runtime and Plaintext languages.

Cargo Features

"Why is Inkjet so large?"

Parser sources generated by tree-sitter can grow quite big, with some being dozens of megabytes in size. Inkjet has to bundle these sources for all the languages it supports, so it adds up. (According to loc, there are over 23 million lines of C code!)

If you need to minimize your binary size, consider disabling languages that you don't need. Link-time optimization can also shave off a few megabytes.

"Why is Inkjet taking so long to build?"

Because it has to compile and link dozens of C/C++ programs: the parsers and scanners for every language Inkjet bundles.

However, after the first build, these artifacts will be cached and subsequent builds should be much faster.

"Why does highlighting require a mutable reference to the highlighter?"

Under the hood, Inkjet creates a tree-sitter highlighter/parser object, which in turn dynamically allocates a chunk of working memory. Using the same highlighter for multiple simultaneous jobs would therefore cause all sorts of nasty undefined behavior (UB).

If you want to highlight in parallel, you'll have to create a clone of the highlighter for each thread. I recommend `thread_local!` and `RefCell` if you need a quick and easy solution.
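As a rough illustration of the `thread_local!` plus `RefCell` pattern, here is a minimal sketch using only the standard library. `StandInHighlighter` and `highlight_on_this_thread` are hypothetical names invented for this example, not part of Inkjet's API; the stand-in only mimics the fact that highlighting needs `&mut self`. In real code you would put Inkjet's highlighter in the thread-local slot instead.

```rust
use std::cell::RefCell;
use std::thread;

/// Hypothetical stand-in for a highlighter (NOT Inkjet's real type).
/// Like a real highlighter, it reuses internal working memory and so
/// needs `&mut self` to do its job.
struct StandInHighlighter {
    scratch: String,
}

impl StandInHighlighter {
    fn new() -> Self {
        Self { scratch: String::new() }
    }

    /// Pretend "highlighting": wraps the source in a <pre> element.
    fn highlight(&mut self, source: &str) -> String {
        self.scratch.clear();
        self.scratch.push_str("<pre>");
        self.scratch.push_str(source);
        self.scratch.push_str("</pre>");
        self.scratch.clone()
    }
}

thread_local! {
    // Each thread lazily creates its own highlighter on first use,
    // so no two threads ever share the same working memory.
    static HIGHLIGHTER: RefCell<StandInHighlighter> =
        RefCell::new(StandInHighlighter::new());
}

/// Highlights `source` with the calling thread's private highlighter.
fn highlight_on_this_thread(source: &str) -> String {
    HIGHLIGHTER.with(|h| h.borrow_mut().highlight(source))
}

fn main() {
    // Spawn one thread per snippet; each uses its own highlighter.
    let handles: Vec<_> = vec!["fn a() {}", "fn b() {}"]
        .into_iter()
        .map(|src| thread::spawn(move || highlight_on_this_thread(src)))
        .collect();

    for handle in handles {
        println!("{}", handle.join().unwrap());
    }
}
```

The `RefCell` gives you the mutable borrow that highlighting requires without needing a `Mutex`, since each thread only ever touches its own instance.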

"A language I want to highlight isn't bundled with Inkjet!"

Assuming that you or someone else has implemented a highlighting-ready tree-sitter grammar for the language you want, adding it to Inkjet is easy! Just open an issue asking for it to be added, linking to the grammar repository for the language.

Alternatively, you can use `Language::Runtime`, which will allow you to use grammars not bundled with Inkjet.

Other notes:

Building

For normal use, Inkjet will compile automatically just like any other crate.

However, if you have forked the repository and want to update the bundled languages, you'll need to use GNU Make with the included Makefile:

If, for whatever reason, you don't have GNU Make available: you can also perform these actions manually by setting the appropriate environment variables and Cargo flags:

Acknowledgements