sharkdp / bat

A cat(1) clone with wings.
Apache License 2.0
48.65k stars 1.23k forks source link

Share sublime syntaxes #919

Open Keats opened 4 years ago

Keats commented 4 years ago

Bat and Zola both use syntect for highlighting syntaxes and have their own folder for additional syntaxes on top of the official Sublime repo. It would be a good idea to merge so there is one known source of syntaxes that all of syntect users can contribute to.

The idea would be:

A bonus point for syntect is to have a Rust CI setup on that repo that tests every syntax for both rust-onig and fancy-regex, which can be used to report potential bugs/issues.

keith-hall commented 4 years ago

I've seen that quite often, tmLanguage syntax grammars are manually converted to sublime-syntax format for use in syntect. Have you considered adding support for tmLanguage files directly in syntect, it seems it could save some hassle? They essentially support a subset of sublime-syntax functionality, so the existing data structures can be used, it would just be a case of parsing the plist/xml format...

sharkdp commented 4 years ago

@Keats Thank you for your suggestion. As initially discussed in https://github.com/trishume/syntect/issues/287, that sounds like a good idea in principle.

Before we go forward with this, I'd first like to think about the possible downsides as well. I'm only interested in following this route, if it really leads to less maintenance work overall. There are also a few questions/problems that would need to be sorted out.

Pros:

Cons (some of them are specific to bat):

Potential issues:

In summary, I would currently actually tend towards keeping things as they are. Maintaining bat is a large amount of work already and I see a certain chance that this change would increase the amount of maintenance work. If someone can convince me otherwise, I'm happy to consider implementing this.

sharkdp commented 4 years ago

Related discussion on the syntect repo: https://github.com/trishume/syntect/issues/168

pksunkara commented 3 years ago

Pasting my comment from a different issue:

I think the whole community would benefit if you can extract out the syntax highlighting data and maintain it separately (maybe a git submodule?). I haven't see anyone else in the rust ecosystem do it as good as you and they all have issues like this (it works in bat). Let's try to reduce duplication work and keep the efforts in one place.

I understand the concerns about git submodule. I am not knowledgable about syntect, but I would like to enquire about the feasibility of doing the syntax data as a crate? That way, we can keep it in this repo (using cargo workspaces).

epage commented 1 year ago

Keats originally opened this up about having a central dumping ground for syntaxes which had some downsides (bloat, worse development experience for bat). We've seen though that there are cases where tools are fine "just doing whatever bat does" (broot, delta). I'm yet another person who wants to copy bat in my own tool, git-dive, a potential replacement for git-blame (so maybe https://github.com/sharkdp/bat/issues/1536 won't be needed if this works out?).

delta is taking the approach of depending on bat. That requires pulling in a lot of stuff that I won't be using. The alternative I wanted to propose was one or multiple of

These crates would reside in the bat repo and be part of a bat workspace. I know managing multiple crates adds extra overhead. cargo-release has been a big help for me and I would be willing to take input for what, if anything, would be needed for bat to use it. Of course, there are also several other tools in that space.

There is still changelog management (I'm hopeful [cargo-changelog])https://crates.io/crates/cargo-changelog) will help) and selecting versions. For versions, I use conventional commit style so I'm less likely to miss breaking changes when choosing my version but I would even be fine if these crates bumped their major version on every release.

sharkdp commented 1 year ago

@epage Thank you for your input. I'm willing to consider this now that I have some experience with cargo workspaces from another project.

The alternative I wanted to propose was one or multiple of

* A syntax and a theme asset crate

* A `HighlightingAssets ` crate(s) (unsure if syntax and theme should be split out into separate crates)

  * Digging into this code, it feels there is a lot of this is generally useful

I'm not sure I understand what the difference is? What would the first "syntax and theme asset crate" be? A very low-level crate that basically just includes the binary assets and one or two functions to get access to them (in a &[u8] sense)?

While that would allow even more flexibility, I'd still lean towards keeping the number of crates minimal. If we would have a proper cargo workspace though, we could also cleanly separate the bat crate into a library and the CLI.

epage commented 1 year ago

Yes, the crates would be

The reason I was thinking multiple crates

CosmicHorrorDev commented 1 year ago

I'm seemingly in the same position as @epage where I just want to do whatever bat does in terms of providing extra syntaxes (and to a lesser extent themes) without having to pull in all of bat as a lib

I'd be up for putting in the work to either provide it as a separate crate(s) in a workspace here, or just providing it as a separate crate on crates.io if it's deemed too much effort to maintain here

CosmicHorrorDev commented 1 year ago

I went ahead and threw together a quick crate that roughly has the core functionality that I would want. All it currently provides:

Some misc internal changes include switching the create.sh shell file that bat uses to a cargo-xtask task along with changing the license detection logic from matching on snippets to askalono (The crate that powers cargo-deny's license detection logic)

This doesn't add the asset loader logic that bat currently uses, but that should be a pretty simple addition (I just don't need it for my purposes currently)


Re:

I'm only interested in following this route, if it really leads to less maintenance work overall.

It would be pretty easy to switch things over and integrate some variation of my crate into bat, but I'm pretty doubtful that it will reduce the maintenance work. I'd be happy to just provide the crate separately and quietly keep things up to date with bat's assets in the background