alecthomas / chroma

A general purpose syntax highlighter in pure Go
MIT License

Lexer conversion from Textmate? #849

Open anthony-c-martin opened 1 year ago

anthony-c-martin commented 1 year ago

What problem does this feature solve?

I'm a maintainer of Bicep, and as we are still actively developing the language, we try to keep syntax highlighting support for various libraries simple to maintain and as closely in sync with the current spec as possible.

For us, this means keeping generators for popular highlighters checked into our codebase, for example Monarch, TextMate and highlight.js. We validate their output against code samples, which we also verify compile against the latest version of the language.

I've been building a Hugo docs site, and noticed that Chroma's lexer support for Bicep is a bit outdated; I'm wondering if there's a similar process we can follow. Because our TextMate grammar is generated from this code, it's quite structured and I believe it only uses a fairly limited TextMate feature set. I'm curious whether there are any capabilities to translate from one format to the other.

What feature do you propose?

In an ideal world, a conversion tool from TextMate to Chroma (this is probably an impossible ask!)

Barring that, a good explanation of the similarities/differences between the formats, or a conversion tool that does part of the work for you.

alecthomas commented 1 year ago

The conversion of the structure itself would not be too difficult, and naturally I've pondered this myself in the past, but the problem is that TextMate uses the Oniguruma regular expression engine. Oniguruma's syntax is both extremely complex and isn't compatible with anything else. So while the structural conversion might be simple, the individual patterns may or may not work.

That said, most syntax highlighters are conceptually very similar, so it should be straightforward to implement, and if the patterns aren't too complex it might be fine.
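One rough way to triage patterns before converting them is to see which ones a stricter engine rejects. The sketch below uses Go's standard `regexp` package purely as a conservative proxy: it is not the engine Chroma runs patterns with, so a rejected pattern only means "check this one by hand", and an accepted pattern can still differ subtly in meaning from Oniguruma. The function name `needsReview` and the sample patterns are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// needsReview returns the patterns that Go's standard regexp package
// refuses to compile. This is a conservative proxy check (an assumption
// of this sketch), not a test against Chroma's own regex engine.
func needsReview(patterns []string) []string {
	var flagged []string
	for _, p := range patterns {
		if _, err := regexp.Compile(p); err != nil {
			flagged = append(flagged, p)
		}
	}
	return flagged
}

func main() {
	// Hypothetical patterns of the kind a generated grammar might contain.
	patterns := []string{
		`\b(param|var|resource)\b`, // plain alternation with word boundaries
		`(?<=\.)\w+`,               // lookbehind, which the stdlib engine rejects
	}
	for _, p := range needsReview(patterns) {
		fmt.Println("review by hand:", p)
	}
}
```

If a generated grammar sticks to simple constructs, most of its patterns should pass a check like this, which matches the "if the patterns aren't too complex it might be fine" expectation above.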

alecthomas commented 1 year ago

PS. Chroma now has an XML format for its lexers, so writing a converter to that from your existing code would be fairly simple too.