numtide / treefmt

one CLI to format your repo [maintainers=@zimbatm,@brianmcgee]
https://treefmt.com
MIT License
590 stars 37 forks source link

Inconsistent: error: unexpected changes detected, --fail-on-change is enabled #342

Closed terlar closed 2 months ago

terlar commented 3 months ago

Describe the bug I just switched to the new release treefmt v2.0.2 and I am experiencing some weird issues with the --fail-on-change flag.

I am running treefmt via pre-commit (git-hooks.nix).

I first noticed the issue in CI which seems to consistently fail for a specific commit. Same commit passed locally. Then next commit that I did to debug the issue failed locally. Then I committed again and it passed. It also passed on CI. The CI is running within a dedicated docker container including pre-commit built with Nix. Same docker image with same hash running locally works every time for any commit I tried so far. However then I am mounting the code as a volume, so it might have something that affects the result.

Given the nature of the reproducibility I cannot provide exact steps. I will look further. Any ideas how to debug this?

terlar commented 3 months ago

Update, I just managed to reproduce it once locally in the docker container, but don't seem to be able to reproduce it again.

brianmcgee commented 3 months ago

@terlar trying upping the logging with -v or -vv, that might shed some light

terlar commented 3 months ago

Yeah, only when I enabled the logging it didn't fail, but I guess I will just have to leave it with logging on and keep running until I run into the issue again.

brianmcgee commented 3 months ago

Catching the failure and running git diff might shed some light too. If that shows nothing changed, then treefmt detects an incorrect change. It may be that a change is happening for some reason.

This is how we generally implement treefmt checks, as you can see here.

terlar commented 3 months ago

There are no changes, but I have a hunch. When I ran treefmt in an infinite loop, and pressed ctrl+c to forcefully cancel the infinite loop. Then I saw one of the formatters put temporary files in the tree. Could it be that they are somehow picked up if formatters are running in parallel. I have two formatters working on the same file.

This is just a hunch and no proof.

terlar commented 3 months ago

or actually seems neither treefmt standalone or via pre-commit will work on untracked files

brianmcgee commented 3 months ago

That's correct, v2 will not operate on untracked files when using the default git walker. If you use --walk filesystem it will just operate on whatever it finds.

terlar commented 3 months ago

Turns out one thing that prevented me from reproducing the error was that I also added --no-cache to rule out caching issues. But turns out there is a bug there, reported as a separate issue #343

terlar commented 3 months ago

Just managed to reproduce this with -v, no luck:

$ git status -sb                                                                                                                                                                                                                                                                                                     
## my-branch...origin/my-branch
M  flake.nix
$ git commit --amend --no-edit
deadnix..................................................................Passed
statix...................................................................Passed
treefmt..................................................................Failed
- hook id: treefmt
- exit code: 1

INFO format | dos2unix: 1 files processed in 1.137042ms
INFO format | nixfmt: 1 files processed in 12.342613ms
treefmt: error: unexpected changes detected, --fail-on-change is enabled

validate-gcs-tfbackend...............................(no files to check)Skipped
$ git commit --amend --no-edit
deadnix..................................................................Passed
statix...................................................................Passed
treefmt..................................................................Passed
validate-gcs-tfbackend...............................(no files to check)Skipped
conform enforce..........................................................Passed
[my-branch 8c50b00] ci: verbose treefmt output
 Date: Fri Jul 5 15:45:29 2024 +0200
 1 file changed, 1 insertion(+)
$ git status -sb

As seen by the output, there was no changes, another commit worked without any changes... or I guess I will have to double check the timestamp in case that is what happens, but both those formatters "should" have fixed that.

brianmcgee commented 3 months ago

I'll have a look at improving the logging to make it clearer what's going on.

terlar commented 3 months ago

Okay, I can now reproduce this. It happens every time on the first formatting of the change.

$ git add -Ap flake.nix
diff --git a/flake.nix b/flake.nix
index e72b06f..566f7c4 100644
--- a/flake.nix
+++ b/flake.nix
@@ -89,6 +89,7 @@
             };
           };

+          # test things
           pre-commit.settings.hooks = {
             treefmt.entry = "${config.pre-commit.settings.hooks.treefmt.package}/bin/treefmt -vv --fail-on-change";
             validate-gcs-tfbackend = {
(1/1) Stage this hunk [y,n,q,a,d,e,p,?]? y

$ git status -sb
## my-branch...origin/my-branch [ahead 1, behind 1]
M  flake.nix
$ ls -la flake.nix
-rw-r--r-- 1 user users 4220 Jul  5 16:27 flake.nix
$ git commit --amend --no-edit
deadnix..................................................................Passed
statix...................................................................Passed
treefmt..................................................................Failed
- hook id: treefmt
- exit code: 1

DEBU format: config-file=/nix/store/xgdgjxfn1jl99g7df7iqg9vngbasa80g-treefmt.toml tree-root=/home/user/src/repo
DEBU cache: finished generating change set in 578.713µs
DEBU format | dos2unix: match: /home/user/src/repo/flake.nix
DEBU format | nixfmt: match: /home/user/src/repo/flake.nix
DEBU format | dos2unix: executing: /nix/store/ly1fwwlcb6f1hkfr745qap38q744y3ld-dos2unix-7.5.2/bin/dos2unix --keepdate flake.nix
INFO format | dos2unix: 1 files processed in 1.081072ms
DEBU format | nixfmt: executing: /nix/store/y8662hcdvnnlpa0r04gsprhnwwv8hgam-nixfmt-unstable-2024-07-03/bin/nixfmt flake.nix
INFO format | nixfmt: 1 files processed in 12.176471ms
DEBU cache: finished processing 1 paths in 3.034879ms
treefmt: error: unexpected changes detected, --fail-on-change is enabled

validate-gcs-tfbackend...............................(no files to check)Skipped
$ ls -la flake.nix
-rw-r--r-- 1 user users 4220 Jul  5 16:27 flake.nix
$ git status -sb
## my-branch...origin/my-branch [ahead 1, behind 1]
M  flake.nix
$ git commit --amend --no-edit
deadnix..................................................................Passed
statix...................................................................Passed
treefmt..................................................................Passed
validate-gcs-tfbackend...............................(no files to check)Skipped
conform enforce..........................................................Passed
[my-branch ab6ae39] ci: verbose treefmt output
 Date: Fri Jul 5 15:45:29 2024 +0200
 1 file changed, 2 insertions(+)

Doesn't change the timestamp nor the file.

I will verify if this is related to multiple formatters or only one.

terlar commented 3 months ago

I can verify that this only happens when using two formatters, in my case dos2unix and nixfmt. Any idea what to make of this?

terlar commented 3 months ago

I tested this with treefmt v23e563b that includes with the fix for using --fail-on-change with --no-cache. I am surprised to see the same behavior as I thought perhaps the second pass was due to cache.

I also reproduce this with treefmt by itself to rule out any issues related to other parts.

$ cat bug-test.yaml
key1: value1
key2: value2
key3: value3
$ treefmt --no-cache --fail-on-change
traversed 302 files
emitted 302 files for processing
formatted 302 files (0 changed) in 297ms
$ touch bug-test.yaml
$ treefmt --no-cache --fail-on-change
treefmt: error: unexpected changes detected, --fail-on-change is enabled
$ treefmt --no-cache --fail-on-change
traversed 302 files
emitted 302 files for processing
formatted 302 files (0 changed) in 280ms

I will create an isolated flake with a reproducer.

terlar commented 3 months ago

Steps to reproduce:

  1. Checkout latest treefmt
  2. mkdir -p temp-test
  3. Create a file temp-test/dos2unix+yaml.yaml with contents:
    hello: world
    list:
    - one
    - two
    - three
  4. Create a file temp-test/treefmt.toml with contents:
    
    [formatter.dos2unix]
    command = "dos2unix"
    includes = ["*.yaml"]
    options = ["--keepdate"]

[formatter.yamlfmt] command = "yamlfmt" includes = ["*.yaml"]


5. Run command `nix shell nixpkgs#{dos2unix,yamlfmt} --command nix run . -- -C temp-test --no-cache --fail-on-change`
6. Observe failure
7. Run again, observe success
8. `touch temp-test/dos2unix+yaml.yaml`
9. Run again, observe failure
10. Run again, observe success
11. Run again, observe success
terlar commented 3 months ago

There are no changes to the file and checking the timestamp show there are no changes to modify time.

$ touch temp-test/dos2unix+yaml.yaml 
$ stat temp-test/dos2unix+yaml.yaml
  File: temp-test/dos2unix+yaml.yaml
  Size: 45          Blocks: 8          IO Block: 4096   regular file
Device: 0,31    Inode: 162192099   Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   terje)   Gid: (  100/   users)
Access: 2024-07-06 07:59:12.188926052 +0200
Modify: 2024-07-06 07:59:12.188926052 +0200
Change: 2024-07-06 07:59:12.188926052 +0200
 Birth: 2024-07-06 07:59:07.583913859 +0200
$ nix shell nixpkgs#{dos2unix,yamlfmt} --command nix run . -- -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
treefmt: error: unexpected changes detected, --fail-on-change is enabled
$ stat temp-test/dos2unix+yaml.yaml
  File: temp-test/dos2unix+yaml.yaml
  Size: 45          Blocks: 8          IO Block: 4096   regular file
Device: 0,31    Inode: 162205743   Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   terje)   Gid: (  100/   users)
Access: 2024-07-06 08:00:18.433113619 +0200
Modify: 2024-07-06 07:59:12.000000000 +0200
Change: 2024-07-06 08:00:18.429113607 +0200
 Birth: 2024-07-06 08:00:18.429113607 +0200
$ nix shell nixpkgs#{dos2unix,yamlfmt} --command nix run . -- -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 5ms
$ stat temp-test/dos2unix+yaml.yaml
  File: temp-test/dos2unix+yaml.yaml
  Size: 45          Blocks: 8          IO Block: 4096   regular file
Device: 0,31    Inode: 162219378   Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   terje)   Gid: (  100/   users)
Access: 2024-07-06 08:00:54.709224730 +0200
Modify: 2024-07-06 07:59:12.000000000 +0200
Change: 2024-07-06 08:00:54.705224718 +0200
 Birth: 2024-07-06 08:00:54.705224718 +0200

Also consecutive runs are passing after the initial failure.

brianmcgee commented 3 months ago

Thanks for the reproducer, I'll dig into this later today.

brianmcgee commented 3 months ago

@terlar if you look at the stat output, you can see that the mod time is actually changing. It's being replaced with a millisecond precision value afterwards.

From some experimentation, this seems to be due to dos2unix. I'm also seeing some inconsistent behaviour from yamlfmt.

I'm going to do a couple of things:

  1. Reduce the mod time comparison to second precision. This is in line with 1.x, and after speaking with @zimbatm, he pointed out that POSIX specifies EPOCH time for mod time.
  2. Improve the logging to make identifying the culprit easier if this occurs again. I'm thinking some flag that enables more robust checking before and after a formatter is applied to ensure it's behaving as expected.
brianmcgee commented 3 months ago

@terlar doing a little manual experimentation, I created a test file like above with:

hello: world
list:
  - one
  - two
  - three

I then ran the following with dos2unix:

❯ stat -c %Y test.yaml && \
    dos2unix -k test.yaml && \
    stat -c %Y test.yaml && \
    dos2unix -k test.yaml && \
    stat -c %Y test.yaml && \
    dos2unix -k test.yaml && \
    stat -c %Y test.yaml 
1720265173
dos2unix: converting file test.yaml to Unix format...
1720265173
dos2unix: converting file test.yaml to Unix format...
1720265173
dos2unix: converting file test.yaml to Unix format...
1720265173

As you can see, the second precision mod time isn't changing. When I run the same test with yamlfmt:

treefmt/temp-test on  fix/fail-on-change [$!+] via ❄️  impure (devshell-env) on ☁️   
❯ stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml 
1720266379
1720266383
1720266383
1720266383

treefmt/temp-test on  fix/fail-on-change [$!+] via ❄️  impure (devshell-env) on ☁️   
❯ stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml 
1720266383
1720266388
1720266388
1720266388

treefmt/temp-test on  fix/fail-on-change [$!+] via ❄️  impure (devshell-env) on ☁️   
❯ stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%Y test.yaml 
1720266388
1720266395
1720266395
1720266395

I ran this test several times without ever touching the test file after first creating it. Without fail, yamlfmt changes the mod time on the first execution and then leaves it alone on subsequent executions.

It appears it doesn't mess with the mod time on first execution on a file which hasn't changed, and then on second execution it changes the mod time, then leaves it along the rest of the time.

brianmcgee commented 3 months ago

Playing around with the format to make it a bit easier to read:

❯ stat --format=%y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%y test.yaml && \
    yamlfmt test.yaml && \
    stat --format=%y test.yaml 
2024-07-06 12:36:41.733215531 +0100
2024-07-06 12:36:58.758401183 +0100
2024-07-06 12:36:58.763401237 +0100
2024-07-06 12:36:58.767401281 +0100

I'm not sure what it's doing.

brianmcgee commented 3 months ago

I was being a bit dense with my test.

❯ stat --format=%Y test.yaml && \ 
∙     yamlfmt test.yaml && sleep 1 && \ 
∙     stat --format=%Y test.yaml && \ 
∙     yamlfmt test.yaml && sleep 1 && \ 
∙     stat --format=%Y test.yaml && \ 
∙     yamlfmt test.yaml && sleep 1 && \ 
∙     stat --format=%Y test.yaml 
1720266636
1720267333
1720267334
1720267335

Looks like yamlfmt is changing the mod time even when the file contents don't change.

brianmcgee commented 3 months ago

yamlfmt is not supposed to be writing to the file if there is no diff: https://github.com/google/yamlfmt/blob/dd8547dbe27f5572ad2cfd3a794fb9ad454c1b23/engine.go#L70-L76

brianmcgee commented 3 months ago

@terlar I checked what version of yamlfmt had been brought in to my devshell and it was old, 0.11.1.

After updating the flake lock, nixos-unstable now brings in 0.13.0 which doesn't seem to have this issue:

❯ stat --format=%Y test.yaml && \
    yamlfmt test.yaml && sleep 1 && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && sleep 1 && \
    stat --format=%Y test.yaml && \
    yamlfmt test.yaml && sleep 1 && \
    stat --format=%Y test.yaml 
1720267906
1720267906
1720267906
1720267906

I tested with your original test case above on the branch for #345, and it's now behaving as expected:

treefmt on  fix/fail-on-change [$!+] via 🐹 v1.22.3 via ❄️  impure (devshell-env) on ☁️   took 2s 
❯ nix run . -- -C temp-test --no-cache --fail-on-change 
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 4ms

treefmt on  fix/fail-on-change [$!+] via 🐹 v1.22.3 via ❄️  impure (devshell-env) on ☁️   
❯ nix run . -- -C temp-test --no-cache --fail-on-change 
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 4ms

treefmt on  fix/fail-on-change [$!+] via 🐹 v1.22.3 via ❄️  impure (devshell-env) on ☁️   
❯ nix run . -- -C temp-test --no-cache --fail-on-change 
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 4ms

treefmt on  fix/fail-on-change [$!+] via 🐹 v1.22.3 via ❄️  impure (devshell-env) on ☁️   
❯ nix run . -- -C temp-test --no-cache --fail-on-change 
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 4ms

TLDR

The problem is fixed by moving to second level mod time precision (#345) and upgrading your yamlfmt version to 0.13.0 (latest in nixos-unstable)

terlar commented 3 months ago

I was running the yamlfmt with the fix all the time. I still see this issue. Try touching the file inbetween. Then it fails again.

But yes, that might be treefmt. I will try your branch again. Next time I create a reproducer, I should add the explicit nixpkgs version. Sorry to waste your time with that detour.

$ nix shell github:numtide/treefmt github:nixos/nixpkgs/nixos-unstable#{dos2unix,yamlfmt} --command treefmt -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 5ms
$ touch temp-test/dos2unix+yaml.yaml
$ nix shell github:numtide/treefmt github:nixos/nixpkgs/nixos-unstable#{dos2unix,yamlfmt} --command treefmt -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
treefmt: error: unexpected changes detected, --fail-on-change is enabled
terlar commented 3 months ago

I tried with that branch and it still fails:

$ touch temp-test/dos2unix+yaml.yaml
$ nix shell github:numtide/treefmt/fix/fail-on-change github:nixos/nixpkgs/nixos-unstable#{dos2unix,yamlfmt} --command treefmt -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
treefmt: error: unexpected changes detected, --fail-on-change is enabled
$ nix shell github:numtide/treefmt/fix/fail-on-change github:nixos/nixpkgs/nixos-unstable#{dos2unix,yamlfmt} --command treefmt -C temp-test --no-cache --fail-on-change
WARN format: no formatter for path: treefmt.toml
traversed 2 files
emitted 2 files for processing
formatted 1 files (0 changed) in 5ms
brianmcgee commented 3 months ago

Ok, when I touch the file in between, I see the same failures as you are.

I'll have another look this evening.

terlar commented 3 months ago

Thank you, it is a tricky one, and not easy to pinpoint. I am not sure it is specific to my formatters, as another person also reported it with another set of formatters here: https://github.com/numtide/treefmt-nix/issues/156 (alejandra, deadnix, prettier, statix).

Seems to be something with multiple formatters touching the same files and weirdly it only fails first time despite not running with cache. So somehow feels like there is still some kind of cache.

Also once again sorry to waste your debug time with not providing a reproducer with explicit yamlfmt version :crying_cat_face:

brianmcgee commented 3 months ago

Ah, this just clicked for me.

In the file cache we record the mod time and the file size. We consider a file to have changed if they don't match.

By touching the file, when treefmt runs, the cache layer thinks the file has been changed and it is emitted for processing. The formatter formats it but doesn't change anything. The new size and mod time gets written to the cache.

Next time you run without touching the file everything is good.

I need to have a think about this one :thinking:

We use this strategy as a light weight change detection mechanism. @zimbatm has talked about configurable change detection strategies which can be turned on at an increased cost. This could be one of those scenarios.

brianmcgee commented 3 months ago

Ok, I don't think it's quite what I said above, but I think it's something in the caching/change detection.

I'm now running late cos I needed to find this :joy:

I'll have a look later.

montchr commented 3 months ago

I am also encountering this (see https://github.com/numtide/treefmt-nix/issues/156) but I am not using yamlfmt.

brianmcgee commented 3 months ago

Looking at this again with fresh eyes, I realized I should have truncated the mod time rather than rounding it.

@terlar I no longer see the issue with touching the files in between treefmt --no-cache --fail-on-change calls. Can you give it another go with the branch from #345 ?

@montchr I'm having trouble checking out the commit you mentioned to test.

terlar commented 3 months ago

@brianmcgee Thank you, seems to work now! :)

montchr commented 3 months ago

@brianmcgee sorry for the confusion, I only posted a comment in the other ticket, created by @zmrocze. I just meant to confirm that the issue wasn't related to yamlfmt.

Thank you all for working on a fix for this. It took me a while to realize it wasn't just a misconfiguration on my end. I'll test #345 in my project.