Clang-format entire codebase

tophyr commented 1 week ago

Pull Request Type

[ ] GitHub Workflow changes
[ ] Documentation or Wiki changes
[ ] Build and Dependency changes
[ ] Runtime changes
- [ ] Render changes
- [ ] Audio changes
- [ ] Input changes
- [ ] Network changes
- [x] Other changes

Description

Adds lint scripts that run clang-format, and formats the codebase.

Related Issues

Screenshots (if applicable)

Checklist

[x] I have tested my changes locally and verified that they work as intended.
[x] I have documented any new or modified functionality.
[x] I have reviewed the changes to ensure they do not introduce any unnecessary complexity or duplicate code.
[x] I understand that by submitting this pull request, I am agreeing to license my contributions under the project's license.

Additional Comments

Am curious why the existing .github/workflows/clang-format-check.yml job isn't yelling about things. Want to understand that before merging this; this will be useless and just noisy unless CI enforces clang-format going forward.

Lgt2x commented 1 week ago

I really dislike the pre-commit hook idea. When I do a quick draft PR/commit, I don't want to be bothered by git preventing me from committing because the formatting is wrong. There are extensive articles about pre-commit hooks slowing down the development process. It also creates a dependency to clang-format for all developers. Imagine you just want to update some doc, even though you don't edit the code at all.

Automatic formatting check on CI has been disabled because it was causing problems, I don't remember the exact reason, but you can probably look back into past PRs for it.

For me scripts we can run before releases that don't bother developers are a possible solution (but please don't put them in project root directly)

Jayman2000 commented 1 week ago

I really dislike the pre-commit hook idea. When I do a quick draft PR/commit, I don't want to be bothered by git preventing me from committing because the formatting is wrong.

What if we told developers to run pre-commit run clang-format instead of pre-commit install? What if we used stages: [manual]?

There are extensive articles about pre-commit hooks slowing down the development process.

Could you link to some of them? I use pre-commit on pretty much all of my projects, so I’m interested if it’s causing me problems that I’m not aware of.

It also creates a dependency to clang-format for all developers. Imagine you just want to update some doc, even though you don't edit the code at all.

I disagree. I don’t think that this would create a dependency on clang-format for all developers. For one thing, pre-commit will automatically install clang-format for you if it’s not installed:

We built pre-commit to solve our hook issues. It is a multi-language package manager for pre-commit hooks. You specify a list of hooks you want and pre-commit manages the installation and execution of any hook written in any language before every commit. pre-commit is specifically designed to not require root access. If one of your developers doesn’t have node installed but modifies a JavaScript file, pre-commit automatically handles downloading and building node to run eslint without root.

Additionally, developers don’t have to run pre-commit if they don’t want to. It would be more accurate to say “It also creates a dependency on the pre-commit command for developers who want to use pre-commit.”

tophyr commented 1 week ago

pre-commit looks like an interesting concept. My guess is that @Lgt2x is talking strictly about git pre-commit hooks, which don't have any of the fancy pre-commit tool capabilities. I have all three of Windows, Mac and Linux dev envs - I'll try the pre-commit tool and see how easy it is to "bootstrap".

The primary objection I have to automatically formatting during a precommit is that I've seen situations where I explicitly want something formatted a certain way, and having that get changed underneath my feet (as opposed to failing a CI check, or giving me a warning about it pre/post commit) would piss me off.

@Jayman2000 would a commit-time warning, with a manually-invoked script, satisfy you? fwiw, i also get super pissed off at work when linters fail my diffs in CI - so having a local warning would be, i think, the best of both worlds.

tophyr commented 1 week ago

I see the commits that add the lint scripts, but I don’t see a commit that formats the code base.

@Jayman2000 good catch, looks like in my various rebases to get this onto upstream tip I actually didn't re-run the format diff.

Jayman2000 commented 1 week ago

@Jayman2000 would a commit-time warning, with a manually-invoked script, satisfy you? fwiw, i also get super pissed off at work when linters fail my diffs in CI - so having a local warning would be, i think, the best of both worlds.

Yeah, that would satisfy me. That being said, I don’t even want it to be triggered before a commit happens. I would be fine if it was triggered before a commit happens, but I wasn’t trying to make that suggestion. I was only suggesting pre-commit because I thought that a 6-line YAML file looked a lot simpler than a 56-line Batch script and a 74-line shell script.

Here’s one way that the pre-commit tool could be used without it actually being triggered by git commit:

diff --git a/tools/.pre-commit-config.yaml b/tools/.pre-commit-config.yaml
new file mode 100644
index 00000000..37d3cd34
--- /dev/null
+++ b/tools/.pre-commit-config.yaml
@@ -0,0 +1,8 @@
+repos:
+  -
+    repo: https://github.com/pre-commit/mirrors-clang-format
+    rev: v18.1.6
+    hooks:
+      -
+        id: clang-format
+        stages: [manual]
diff --git a/tools/lint.bat b/tools/lint.bat
new file mode 100644
index 00000000..cd8ac880
--- /dev/null
+++ b/tools/lint.bat
@@ -0,0 +1,4 @@
+@echo off
+
+for /f "delims=" %%i in ('git rev-parse --show-toplevel') do set WORKTREE_ROOT=%%i
+pre-commit run --config "%WORKTREE_ROOT%\tools\.pre-commit-config.yaml" --hook-stage manual --all
diff --git a/tools/lint.sh b/tools/lint.sh
new file mode 100644
index 00000000..d7009fd8
--- /dev/null
+++ b/tools/lint.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+WORKTREE_ROOT="$(git rev-parse --show-toplevel)"
+exec pre-commit run --config "$WORKTREE_ROOT/tools/.pre-commit-config.yaml" --hook-stage manual --all

tophyr commented 1 week ago

I was only suggesting pre-commit because I thought that a 6-line YAML file looked a lot simpler than a 56-line Batch script and a 74-line shell script.

Ahh, gotcha! Ok cool, I'll check it out. The scripts are kind of hilariously onerous, for what they do (just invoke clang-format on all files). The complexity is really just in making them parallelize the execution. Takes the formatting job from ~15 seconds to ~2. ¯\(ツ)/¯

Lgt2x commented 1 week ago

What if we told developers to run pre-commit run clang-format instead of pre-commit install? What if we used stages: [manual]?

There are extensive articles about pre-commit hooks slowing down the development process.

Could you link to some of them? I use pre-commit on pretty much all of my projects, so I’m interested if it’s causing me problems that I’m not aware of.

[1] [2] Are you using in distributed open-source teams like ours or only for your solo projects? Even for closed teams, I've never been a fan, mostly for the reasons mentioned here.

We built pre-commit to solve our hook issues. It is a multi-language package manager for pre-commit hooks. You specify a list of hooks you want and pre-commit manages the installation and execution of any hook written in any language before every commit. pre-commit is specifically designed to not require root access. If one of your developers doesn’t have node installed but modifies a JavaScript file, pre-commit automatically handles downloading and building node to run eslint without root.

Additionally, developers don’t have to run pre-commit if they don’t want to. It would be more accurate to say “It also creates a dependency on the pre-commit command for developers who want to use pre-commit.”

Ok there is our misunderstanding. I thought pre-commit hooks were mandatory to commit anything to the repo, but that does not seem to be the case. As mentionned, I thought about git hooks, not your pre-commit solution which is a different thing

Jayman2000 commented 1 week ago

[1] [2] Are you using in distributed open-source teams like ours or only for your solo projects? Even for closed teams, I've never been a fan, mostly for the reasons mentioned here.

I mainly use it for personal projects. All of my personal projects are open-source so technically anyone can contribute, but not a lot of people use my projects so I get very few contributions. I think that I’ve also encountered pre-commit when contributing to other open-source projects, but I can’t remember which ones.

tophyr commented 4 days ago

@Jayman2000 I've tried it out, and I have mixed feelings on the pre-commit tool. On one hand, it seems very nice and convenient - but on the other, I don't necessarily love the idea of requiring a pip install and I'm getting strong "npm + left-pad" vibes... it seems like a whole lot of infrastructure to rely on just to basically run find . -name "*.cpp" -exec clang-format -i {} \+.

(That last part is definitely unfair and reductive; pre-commit can clearly do a lot more than that. But on the other hand, we're not currently using anything else from it.)

Even my lint scripts here are kind of massively-overengineered, really. They're both a whole bunch of complexity in order to run clang-format optimally parallelized, when realistically the job completes serially in under a minute.

Thoughts?

Jayman2000 commented 2 days ago

@Jayman2000 I've tried it out, and I have mixed feelings on the pre-commit tool. On one hand, it seems very nice and convenient - but on the other, I don't necessarily love the idea of requiring a pip install and I'm getting strong "npm + left-pad" vibes... it seems like a whole lot of infrastructure to rely on just to basically run find . -name "*.cpp" -exec clang-format -i {} \+.

(That last part is definitely unfair and reductive; pre-commit can clearly do a lot more than that. But on the other hand, we're not currently using anything else from it.)

Even my lint scripts here are kind of massively-overengineered, really. They're both a whole bunch of complexity in order to run clang-format optimally parallelized, when realistically the job completes serially in under a minute.

Thoughts?

When I first read your post, my thought was “Yeah, pre-commit is probably not the best tool to use in this situation. It would probably be better to just write a new, simpler clang-format script.” I created an alternative PR that contains a simpler script, but as I was finishing the commits in that PR’s branch, I realized something. Having a script that runs clang-format on everything is probably the wrong way to go about doing things.

For one thing, I’m assuming that the goal isn’t just to clang-format the codebase one time. I’m assuming that the goal is to keep the codebase properly formatted, so I updated and re-enabled the clang-format Check workflow. Here’s where I ran into a problem. That workflow uses jidicula/clang-format-action. The latest version of jidicula/clang-format-action (v4.13.0) uses version 18.1.3 of clang-format. At the moment, the version of Linux that I use (NixOS 24.05) ships clang-format 18.1.7. When I run the script, it formats everything successfully, but when when I push my changes, the GitHub Action complains that the code isn’t formatted correctly. clang-format 18.1.3 and clang-format 18.1.7 produce different results.

I can think of a few potential solutions to this problem:

Turn the script into a #!/usr/bin/env nix-shell script that uses a shell.nix file. That way we could know for sure that the user is using clang-format 18.1.3. This is basically what I did in order to finish that alternative PR, but I took a long time to wait for clang to compile. Also, Nix doesn’t really work on Windows yet.
Instead of having a workflow that runs clang-format, have a workflow that runs pre-commit. That way, the version of clang-format will always be the same regardless of whether its being run locally or run in CI.
Instead of running clang-format directly, tell users to run the clang-format workflow locally via Act. This has two disadvantages. For one thing, it would be much slower than running clang-format normally because jidicula/clang-format-action is really inefficient. Additionally, it’s kind of a hack. You’re not supposed to run GitHub Actions locally because GitHub Actions are supposed to create vendor lock-in. Instead of trying to workaround the vendor lock-in, I think that it would be better to choose something like pre-commit that doesn’t try to create vendor lock-in in the first place.
Instead of running clang-format directly, write a script that uses the same Docker container that jidicula/clang-format-action uses.
Continue to have a script that runs whatever version of clang-format users happen to have on their path. If the clang-format Check workflow fails on a PR, have contributors manually apply the formatting changes.

I'm getting strong "npm + left-pad" vibes...

I agree. pre-commit is effectively a platform-neutral CI system, and every CI system that I’ve ever encountered gives me npm + left-pad vibes. If you know of any CI systems that are different, the I would be interested in learning more about them.

Lgt2x commented 8 hours ago

Hi @Jayman2000 , thanks for taking the time to investigate and report possible solutions to the formatting issue. As you figured out, Github actions for formatting only tell us whether the code is properly formatted or not, and will unfortunately not format the code for you. Enforcing this will be an additional constraint for contributors, that need to install clang-format themselves and figure out a way to make clang-format happy.

Let's keep in mind that we're not Google/Facebook with 150 engineers working full time on the project, but a small team of dedicated hobbyists fond of the game, and attracting new contributors by lowering the barrier to entry should be a higher priority than keeping the code perfectly formatted at all time. Forcing potential contributors to install dedicated tools just for formatting code is not a good solution whatsoever, let alone formater version conflicts. The only case where enforcing clang-format for contributions is acceptable is when you have an automated tool that can run on the PR to reformat and push fixes automatically to your branch (see for example the VTK gitlab bot). When it's not automated, it's just a hassle for devs.

I suggest we stop wasting time on an issue as minor as this one. We'll just run clang-format manually before each release on the whole codebase, that'll be good enough for the scale of this project.

DescentDevelopers / Descent3