numtide / treefmt

one CLI to format your repo [maintainers=@zimbatm,@brianmcgee]
https://treefmt.com
MIT License
620 stars 38 forks source link

treefmt cache results in warnings getting suppressed #433

Closed jfly closed 1 month ago

jfly commented 1 month ago

Describe the bug

treefmt helpfully logs warnings when it doesn't know how to format a file:

$ treefmt test.py
WARN format: no formatter for path: test.py
traversed 1 files
emitted 1 files for processing
formatted 0 files (0 changed) in 21ms

However, if you try to format that exact same file again, you won't see the warning:

$ treefmt test.py
traversed 1 files
emitted 0 files for processing
formatted 0 files (0 changed) in 13ms

Note how the warning reappears if you invoke treefmt with --no-cache:

$ treefmt --no-cache test.py
WARN format: no formatter for path: test.py
traversed 1 files
emitted 1 files for processing
formatted 0 files (0 changed) in 3ms

To Reproduce

Run treefmt twice on the same file.

Expected behavior

I'd like to see the same set of warnings every time, regardless of the state of my treefmt cache.

System information

I'm using the the latest code on main (714cf33489c936d118d60dcbc066b79a53b8b271 at time of writing).

brianmcgee commented 1 month ago

I need to fish out the issue, but I remember this being discussed and someone arguing for the "warn once, but don't annoy me every time" version, which it currently does. It's a toss-up, really. I can see the value of both approaches.

I don't want to be ping-ponging between enabling and disabling this every so often.

brianmcgee commented 1 month ago

That being said, with #431, it is now easier to provide persistent config either in the treefmt.toml file or via the environment, so you can choose which behaviour you want and not have to give any extra flags each time.

Perhaps a cache-unmatched option is false by default, but can you silence it quickly enough? :thinking:

jfly commented 1 month ago

Speaking as a user of a piece of software, I am surprised if the existence of a cache affects anything other than the performance of the software. This is a situation where the state of the cache affects the actual output of the software, which smells odd to me.

I remember this being discussed and someone arguing for the "warn once, but don't annoy me every time" version, which it currently does

If people want a "warn me once for each new file" behavior, that feels like it should be implemented with state, not a cache. The current implementation doesn't even achieve the "warn me once" behavior, because it'll re-warn me about files every time I change those files (because it busts treefmt's cache).

But ignoring warnings, maybe we can agree that --on-unmatched=fatal should behave consistently regardless of cache? Right now, it doesn't.

With a warm cache, treefmt runs successfully:

./ $ treefmt --on-unmatched=fatal
traversed 3 files
emitted 0 files for processing
formatted 0 files (0 changed) in 14ms

./ $ echo $status
0

With no cache, it errors out about my (cached) .envrc file which it doesn't know how to format:

./ $ treefmt --on-unmatched=fatal --no-cache
traversed 3 files
emitted 3 files for processing
formatted 0 files (0 changed) in 1ms
Error: no formatter for path: .envrc

./ $ echo $status
1
jfly commented 1 month ago

That being said, with https://github.com/numtide/treefmt/pull/431, it is now easier to provide persistent config either in the treefmt.toml file or via the environment, so you can choose which behaviour you want and not have to give any extra flags each time.

I do agree that helps! My suggestion would be for treefmt to always warn about these things, and folks can turn the warnings off if they want.

brianmcgee commented 1 month ago

I agree about the bahviour being inconsistent.

With the recent changes it should be straightforward to always log, and the user can silence the warning by changing the level in their config or an env variable.

brianmcgee commented 1 month ago

There will be a performance hit, but how much, I'm not sure. This would require us to always apply the matching rules to every file rather than only matching on files that have changed. I'll play around with it later today and try it out on nixpkgs.

brianmcgee commented 1 month ago

I made this change in #447.

The performance impact of matching before checking the cache appears to be negligible when run against Nixpkgs:

# before

❯ nix run github:numtide/treefmt -- --config-file ./test/examples/nixpkgs.toml --tree-root /home/brian/Development/com/github/nixos/nixpkgs -u info -c 
traversed 43492 files
emitted 43492 files for processing
formatted 36059 files (21522 changed) in 21.107s

❯ nix run github:numtide/treefmt -- --config-file ./test/examples/nixpkgs.toml --tree-root /home/brian/Development/com/github/nixos/nixpkgs -u info  
traversed 43492 files
emitted 0 files for processing
formatted 0 files (0 changed) in 298ms

# after

❯ nix run .# -- --config-file ./test/examples/nixpkgs.toml --tree-root /home/brian/Development/com/github/nixos/nixpkgs -u info -c 
traversed 43492 files
emitted 43492 files for processing
formatted 36059 files (21522 changed) in 21.031s

# after with hot cache

❯ nix run .# -- --config-file ./test/examples/nixpkgs.toml --tree-root /home/brian/Development/com/github/nixos/nixpkgs -u info   
traversed 43492 files
emitted 0 files for processing
formatted 0 files (0 changed) in 282ms