helix-editor / helix

A post-modern modal text editor.
https://helix-editor.com
Mozilla Public License 2.0

update `languages.toml` file using GitHub's `languages.yml` #5904

Open k12ish opened 1 year ago

k12ish commented 1 year ago

I opened #5347 a while back to add syntax highlighting for .bash_aliases files. This involved a one-line change in the languages.toml file:

[[language]]
name = "bash"
scope = "source.bash"
injection-regex = "(shell|bash|zsh|sh)"
file-types = ["sh", "bash", "zsh", ".bash_login", ".bash_logout", ".bash_profile", ".bashrc", ".profile", ".zshenv", ".zlogin", ".zlogout", ".zprofile", ".zshrc", "APKBUILD", "PKGBUILD", "eclass", "ebuild", "bazelrc", ".bash_aliases"]
shebangs = ["sh", "bash", "dash", "zsh"]
roots = []
comment-token = "#"
language-server = { command = "bash-language-server", args = ["start"] }
indent = { tab-width = 2, unit = "  " }

GitHub maintains a languages.yml file which contains much of the same data:

Shell:
  type: programming
  color: "#89e051"
  aliases:
  - sh
  - shell-script
  - bash
  - zsh
  extensions:
  - ".sh"
  - ".bash"
  - ".bats"
  - ".cgi"
  - ".command"
  - ".env"
  - ".fcgi"
  - ".ksh"
  - ".sh.in"
  - ".tmux"
  - ".tool"
  - ".zsh"
  - ".zsh-theme"
  filenames:
  - ".bash_aliases"
  - ".bash_history"
  - ".bash_logout"
  - ".bash_profile"
  - ".bashrc"
  - ".cshrc"
  - ".env"
  - ".env.example"
  - ".flaskenv"
  - ".kshrc"
  - ".login"
  - ".profile"
  - ".zlogin"
  - ".zlogout"
  - ".zprofile"
  - ".zshenv"
  - ".zshrc"
  - 9fs
  - PKGBUILD
  - bash_aliases
  - bash_logout
  - bash_profile
  - bashrc
  - cshrc
  - gradlew
  - kshrc
  - login
  - man
  - profile
  - zlogin
  - zlogout
  - zprofile
  - zshenv
  - zshrc
  interpreters:
  - ash
  - bash
  - dash
  - ksh
  - mksh
  - pdksh
  - rc
  - sh
  - zsh
  tm_scope: source.shell
  ace_mode: sh
  codemirror_mode: shell
  codemirror_mime_type: text/x-sh
  language_id: 346

Rather than incrementally improving the languages.toml file, perhaps we could import the relevant data from GitHub itself?
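A rough sketch of what such an import could look like for a single language, assuming both files have already been parsed into plain Python dicts. `merge_file_types` is a hypothetical helper, not something that exists in Helix or in GitHub's tooling. It assumes that `file-types` lists extensions without the leading dot and whole filenames verbatim (as the bash entry above appears to do), and that `interpreters` corresponds to `shebangs`.

```python
def merge_file_types(helix_lang, linguist_entry):
    """Return a copy of helix_lang extended with data from linguist_entry.

    Both arguments are plain dicts; the originals are not modified.
    """
    merged = dict(helix_lang)

    # Assumption: extensions lose their leading dot (".sh" -> "sh"),
    # while filenames such as ".bashrc" or "PKGBUILD" are kept as written.
    candidates = [
        ext[1:] if ext.startswith(".") else ext
        for ext in linguist_entry.get("extensions", [])
    ]
    candidates += linguist_entry.get("filenames", [])

    file_types = list(helix_lang.get("file-types", []))
    for item in candidates:
        if item not in file_types:  # keep existing order, append only new items
            file_types.append(item)
    merged["file-types"] = file_types

    # Assumption: "interpreters" plays the same role as "shebangs".
    shebangs = list(helix_lang.get("shebangs", []))
    for interpreter in linguist_entry.get("interpreters", []):
        if interpreter not in shebangs:
            shebangs.append(interpreter)
    merged["shebangs"] = shebangs

    return merged


# Abbreviated versions of the two entries quoted above.
helix_bash = {
    "name": "bash",
    "file-types": ["sh", "bash", "zsh", ".bashrc"],
    "shebangs": ["sh", "bash", "dash", "zsh"],
}
linguist_shell = {
    "extensions": [".sh", ".bash", ".ksh"],
    "filenames": [".bashrc", ".bash_aliases"],
    "interpreters": ["ash", "bash", "ksh"],
}

print(merge_file_types(helix_bash, linguist_shell))
```

A blind import would also pull in generic filenames such as man or login from the list above, so the merged result would probably still need a manual review.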

archseer commented 1 year ago

This is a great idea, but it might be easiest to do manually for now. If we wrote a script, I guess we'd want to match up languages by comparing the tm_scope with our scope field (it's possible names don't fully match) and update the TOML structure.
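The matching step described here might be sketched as follows, assuming the `[[language]]` tables are available as a list of dicts and the YAML file as a dict keyed by language name. `match_by_scope` is a hypothetical name used only for illustration, and the parsing itself is left out.

```python
def match_by_scope(helix_languages, linguist_languages):
    """Pair Helix languages with Linguist languages that share a scope.

    Returns (matched, unmatched): matched maps a Helix language name to the
    list of Linguist names whose tm_scope equals the Helix scope; unmatched
    lists the Helix names for which no such entry was found.
    """
    # Index Linguist entries by tm_scope. A list is used per scope in case
    # several Linguist languages share the same one.
    by_scope = {}
    for name, entry in linguist_languages.items():
        tm_scope = entry.get("tm_scope")
        if tm_scope:
            by_scope.setdefault(tm_scope, []).append(name)

    matched = {}
    unmatched = []
    for lang in helix_languages:
        names = by_scope.get(lang.get("scope"), [])
        if names:
            matched[lang["name"]] = names
        else:
            unmatched.append(lang["name"])
    return matched, unmatched


# Only the fields relevant to matching, taken from the entries quoted above.
helix_languages = [{"name": "bash", "scope": "source.bash"}]
linguist_languages = {"Shell": {"tm_scope": "source.shell"}}

matched, unmatched = match_by_scope(helix_languages, linguist_languages)
print("matched:", matched)
print("unmatched:", unmatched)
```

With the two entries quoted in this issue, the Helix scope is source.bash while the tm_scope is source.shell, so bash would end up in the unmatched list. A script along these lines would likely need a small hand-maintained table of overrides for cases like that.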