ptpb / pb

pb is a formerly-lightweight pastebin and url shortener

TOML syntax highlighting #221

Closed kaushalmodi closed 5 years ago

kaushalmodi commented 5 years ago

Can we have syntax highlighting for TOML? There are many TOML parser implementations if they can serve as the proof of popularity. The aim is to have links like https://ptpb.pw/cg9R (small TOML snippet) be syntax highlighted using https://ptpb.pw/cg9R/toml.

For example, it looks like this in Emacs:

[image: the TOML snippet as highlighted in Emacs]

buhman commented 5 years ago

Related: https://bitbucket.org/birkenfeld/pygments-main/issues/1150/new-lexer-request-toml-rust

buhman commented 5 years ago

This is not as trivial as implied.

> TOML parser implementations

Actually these are TOML codecs, which aren't directly useful if they don't provide a standalone-usable (much less documented) parser/lexer interface.

The tokens emitted by that lexer also would need to use pygments' token model (or be translated).
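A minimal sketch of what such a translation could look like, assuming a third-party lexer that yields `(kind, value)` pairs. `foreign_tokenize`, its token kinds, and the mapping table are hypothetical stand-ins for illustration, not the interface of any real TOML library; only `pygments.lexer.Lexer` and the `pygments.token` types are actual Pygments names.

```python
import re

from pygments.lexer import Lexer
from pygments.token import Comment, Name, String, Text

# Hypothetical stand-in for a standalone TOML lexer: yields (kind, value).
_FOREIGN_RE = re.compile(
    r"(?P<comment>#[^\n]*)"
    r"|(?P<string>\"[^\"\n]*\")"
    r"|(?P<key>[A-Za-z0-9_-]+(?=\s*=))"
    r"|(?P<other>.)",
    re.DOTALL,
)


def foreign_tokenize(text):
    for match in _FOREIGN_RE.finditer(text):
        yield match.lastgroup, match.group()


# Assumed mapping from the foreign token kinds to Pygments token types.
FOREIGN_TO_PYGMENTS = {
    "comment": Comment.Single,
    "string": String,
    "key": Name.Attribute,
}


class TranslatingLexer(Lexer):
    """Adapts the foreign token stream to the Pygments token model."""

    name = "Translated TOML (sketch)"

    def get_tokens_unprocessed(self, text):
        pos = 0
        for kind, value in foreign_tokenize(text):
            # Anything without a mapping falls back to plain Text.
            yield pos, FOREIGN_TO_PYGMENTS.get(kind, Text), value
            pos += len(value)
```

Calling `TranslatingLexer().get_tokens(source)` then produces ordinary Pygments `(tokentype, value)` pairs that any formatter can consume.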

buhman commented 5 years ago

This looks directly usable, if valid to begin with:

https://github.com/liluo/pygments-github-lexers/blob/master/pygments_github_lexers/github.py

buhman commented 5 years ago

I integrated pygments_github_lexers for toml and puppet in custom-lexers. Unfortunately, github crashes while attempting to generate a diff for that, so I'm unable to create a pull request (you can look at https://github.com/ptpb/pb/commit/b8755dbb8c267cf17aee567854037547e6a43d69 though).

More importantly, that lexer is completely garbage, and looks awful:

Compare that with how the cfg lexer parses this:

buhman commented 5 years ago

As a consequence, I'm going to close this as:

As all of these are applicable.

buhman commented 5 years ago

On the other hand, if you really want the first version as-is, I'm happy to merge terrible lexers.

kaushalmodi commented 5 years ago

I agree that that lexer is terrible, but at least it is not highlighting things incorrectly. The cfg (ini?) lexer works for the simple example. But https://ptpb.pw/cg9R/cfg looks really bad:

[image: rendering of https://ptpb.pw/cg9R/cfg]

Does that incomplete toml lexer look that bad?
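One rough way to compare lexers on a snippet without relying on screenshots is to collect the text each lexer marks as `Error`. The sketch below assumes Pygments is installed and uses a made-up TOML sample (the contents of the linked pastes are not reproduced here); `error_values` is a hypothetical helper name. An empty result only means the lexer accepted the input, not that the highlighting is correct.

```python
from pygments.lexers import get_lexer_by_name
from pygments.token import Error

# Hypothetical sample, not the snippet from the linked paste.
SAMPLE = (
    "[server]\n"
    'name = "pb"\n'
    "ports = [8000, 8001]\n"
    "enabled = true # inline comment\n"
)


def error_values(alias, text):
    """Return the pieces of text that the named lexer could not tokenize."""
    lexer = get_lexer_by_name(alias)
    return [value for ttype, value in lexer.get_tokens(text) if ttype in Error]


if __name__ == "__main__":
    # "cfg" is an alias of the Pygments INI lexer.
    print(error_values("cfg", SAMPLE))
```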

buhman commented 5 years ago

kaushalmodi commented 5 years ago

@buhman I think that lexer looks nice. The only issue is that it's not coloring keys differently, but it recognizes strings, comments, keys with indentation, etc.

buhman commented 5 years ago

I haven't actually read the spec, but how does this look?:

kaushalmodi commented 5 years ago

That looks nice!

kaushalmodi commented 5 years ago

If you'd like to add a full-fledged test, here's a larger TOML snippet: https://ptpb.pw/WwKU

buhman commented 5 years ago

https://ptpb.pw/WwKU/toml it actually has some serious problems, no doubt from not properly using contexts, though at least keys and literals are highlighted ~consistently.

kaushalmodi commented 5 years ago

> it actually has some serious problems, no doubt from not properly using contexts, though at least keys and literals are highlighted ~consistently.

I was going to say that it looks sweet! What serious problems do you see? The only issue I spotted was true not getting highlighted on this line --> https://ptpb.pw/WwKU/toml#L-35

Update: Well.. not just on that line.. basically any line with:

KEY = VAL # COMMENT

buhman commented 5 years ago

Yeah, inline comments break everything that depends on anchors. Line 76, for example. To properly fix that I'd need to push and pop a comment context; I'm probably not motivated enough to seriously spelunk into making this robust.
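For illustration, here is a minimal sketch of a context-based approach with a Pygments `RegexLexer`. It is not the lexer from the commit above, and it swaps in a slightly different technique: rather than a dedicated comment context, it pushes a `value` state after `=` and pops it at the newline, so values and trailing comments are matched without end-of-line anchors. The class name, alias, and rules are assumptions and cover only a small part of TOML.

```python
from pygments.lexer import RegexLexer, bygroups
from pygments.token import (
    Comment, Keyword, Name, Number, Operator, String, Text, Whitespace,
)


class TomlSketchLexer(RegexLexer):
    """Tiny TOML-like lexer that uses a state stack instead of anchors."""

    name = "TOML (sketch)"
    aliases = ["toml-sketch"]

    tokens = {
        "root": [
            (r"\s+", Whitespace),
            (r"#[^\n]*", Comment.Single),
            (r"\[\[?[^\]\n]+\]\]?", Name.Namespace),
            # After "KEY =", switch to the value context.
            (r"([A-Za-z0-9_-]+)(\s*)(=)",
             bygroups(Name.Attribute, Whitespace, Operator), "value"),
        ],
        "value": [
            (r"[ \t]+", Whitespace),
            # An inline comment is just another token inside the value context.
            (r"#[^\n]*", Comment.Single),
            # The newline ends the value and returns to "root".
            (r"\n", Whitespace, "#pop"),
            (r'"[^"\n]*"', String.Double),
            (r"(?:true|false)\b", Keyword.Constant),
            (r"[+-]?\d+(?:\.\d+)?", Number),
            (r"[^\s#]+", Text),
        ],
    }
```

With this layout, a line of the form `KEY = VAL # COMMENT` yields separate tokens for the value and the comment, because neither rule depends on where the line ends.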

kaushalmodi commented 5 years ago

Still, I appreciate the time and effort you put into this, in spite of not being a TOML user. Many thanks!

The current state of TOML highlighting is much better than none at all :D