numtide / treefmt

one CLI to format your repo [maintainers=@zimbatm,@brianmcgee]
https://treefmt.com
MIT License
575 stars 35 forks source link

Implement a more robust file modification detection logic #46

Open basile-henry opened 3 years ago

basile-henry commented 3 years ago

This part of the formatter spec is problematic:

If, and only if, a file format has changed, the formatter MUST write the new content in place of the original file.

My understanding is that many formatters simply don't check if the new formatted version of the file they have is different from the original file. We could ask upstream to change how the formatters handle this edge case, however I believe we should make prjfmt play as nicely as possible with how existing formatters already work.

I propose we change the file modification detection logic slightly. Instead of only relying on the mtime of a file, we should look at its checksum (md5, or anything that is fast really) as well. Since calculating checksums takes a bit more time than reading the mtime we can use the mtime as a first check and only if it has changed calculate the checksum of the file.

zimbatm commented 3 years ago

Do you find formatters that don't respect that part of the spec yet?

I would avoid adding a default performance hit. Ideally, prjfmt should be at least as fast, or faster than the upstream formatter for doing file traversal.

I think what we will end-up doing is adding flags for formatters that don't respect the spec. And enable checksum calculation as a fallback like you suggested.

basile-henry commented 3 years ago

Do you find formatters that don't respect that part of the spec yet?

I thought I remembered seeing this issue in ormolu but now that I am testing it again that doesn't seem to be the case. I've just tested all the formatters we have in examples/monorepo and they all seem to be following the spec. :+1:

Since I am wrong about this being an issue we can close this issue if you want. Unless you think we need to add an option for future formatters that don't respect that.

Should we automate checking if a formatter follows the spec somehow?

Edit: Maybe I was using an older version of ormolu. This feature of not over-writing a correctly formatted file is about 5 months old: https://github.com/tweag/ormolu/commit/e95ec9b83b6b15a203e798bee8fae961a505bc85

zimbatm commented 3 years ago

It's good to have that conversation. I am quite sure that we will be encountering weird implementations out there.

Should we automate checking if a formatter follows the spec somehow?

I like that idea! Maybe the formatter config can include:

[formatters.ormolu]
command = "ormolu" # side discussion: this could default to the formatter name
options = ["--mode=inplace"]
test_content = """
<put some haskell here>
"""

Then run a small test suite whenever the formatter config is missing from the manifest.

Rizary commented 3 years ago

So there is a discussion on rust's zulip community server that I should quote here:

Another big win would be using file hashes instead of mtime to determine file changes.

The "Big win" here is related to implement a functional package manager in Rust similar to what Nix and Guix does.

zimbatm commented 3 years ago

That's in the compiler context which is a bit different. When compiling you get a graph of dependencies to re-compile whenever a file changes. With formatting, each file is independent of the other.

zimbatm commented 3 years ago

Related read: https://apenwarr.ca/log/20181113

Mic92 commented 2 years ago

I worked around the current logic in CI by using cp with timestamps from 1970 when copying files. Otherwise some files changes are not detected. It's a race condition the old timestamp and the new timestamp not changing as they are too close together.

{ runCommandNoCC, git, linters, self }:
runCommandNoCC "check-format"
{
  nativeBuildInputs = linters;
} ''
  # keep timestamps so that treefmt is able to detect mtime changes
  cp --no-preserve=mode --preserve=timestamps -r ${self} source
  cd source
  HOME=$TMPDIR treefmt --fail-on-change
  touch $out
''