lewis6991 / impatient.nvim

Improve startup time for Neovim
MIT License

Using `mtime.sec` for hash is incompatible with Nix managed configurations #42

Closed Steven0351 closed 1 year ago

Steven0351 commented 2 years ago

I've migrated to using Nix to manage my dotfiles and system configuration, which sets all file timestamps to 0. After migrating my fledgling Neovim configuration over, I noticed that none of my changes took effect after updating my configs unless I deleted ~/.cache/nvim/luacache. This was not an issue before I migrated my configuration over to Nix. After digging into the source, I see the hash function is as follows:

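-- fs_stat is the libuv stat binding that Neovim exposes through vim.loop
local fs_stat = vim.loop.fs_stat

-- The cache key for a module is its modification time in whole seconds
local function hash(modpath)
  local stat = fs_stat(modpath)
  if stat then
    return stat.mtime.sec
  end
end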

For this to be more robust, an actual file hash would need to be computed, though I imagine that would also slow things down. However, if the goal is for this plugin to eventually be merged into Neovim core, I think it would affect every Nix Neovim user if it were merged with the hash function as-is.

lewis6991 commented 2 years ago

Sorry, I don't understand why using the modified time for cache invalidation is wrong, or why Nix has issues with it. Can you elaborate?

Using a full file hash would probably nullify any benefit this plugin gives, since it would require loading all the files it is caching in order to create a hash. There would be little point.

Is the problem that nix has issues with libuv or something? And if so shouldn't that be fixed?

lewis6991 commented 2 years ago

Python had (and solved) this problem already in PEP 552:

The current Python pyc format is the marshaled code object of the module prefixed by a magic number, the source timestamp, and the source file size. The presence of a source timestamp means that a pyc is not a deterministic function of the input file’s contents—it also depends on volatile metadata, the mtime of the source. Thus, pycs are a barrier to proper reproducibility.

Distributors of Python code are currently stuck with the options of

  1. not distributing pycs and losing the caching advantages
  2. distributing pycs and losing reproducibility
  3. carefully giving all Python source files a deterministic timestamp (see, for example, https://github.com/python/cpython/pull/296)
  4. doing a complicated mixture of 1. and 2. like generating pycs at installation time

None of these options are very attractive. This PEP proposes allowing the timestamp to be replaced with a deterministic hash. The current timestamp invalidation method will remain the default, though. Despite its nondeterminism, timestamp invalidation works well for many workflows and usecases. The hash-based pyc format can impose the cost of reading and hashing every source file, which is more expensive than simply checking timestamps. Thus, for now, we expect it to be used mainly by distributors and power use cases.

Steven0351 commented 2 years ago

> Is the problem that nix has issues with libuv or something? And if so shouldn't that be fixed?

It's not really anything to do with libuv specifically; it's just that any timestamp-based approach to caching anything managed by Nix is always going to be a cache hit.

Here's an example from my system:

❯ ll ~/.config/nvim/init.lua
lrwxr-xr-x 84 stevensherry 19 Dec 14:54 /Users/stevensherry/.config/nvim/init.lua -> /nix/store/j0c14n9i00df7cf815lpmxspjxf4qv07-home-manager-files/.config/nvim/init.lua

That looks fine, but that timestamp is for the symlink. Here is the timestamp for the final file (in my case this is symlink -> symlink -> concrete file):

❯ ll /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua
.r--r--r-- 523 root 31 Dec  1969 /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua

The SHA256 hash is the same:

❯ openssl dgst -sha256 ~/.config/nvim/init.lua
SHA256(/Users/stevensherry/.config/nvim/init.lua)= c341e0a9a52d2a4802a5e179b93241c56bb913d80bcbcc3f5c4d5dc8db14cab8

❯ openssl dgst -sha256 /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua
SHA256(/nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua)= c341e0a9a52d2a4802a5e179b93241c56bb913d80bcbcc3f5c4d5dc8db14cab8

The problem arises when I do an update: the impatient cache is not invalidated, because Nix sets all timestamps in /nix/store to 0 (this is a feature, not a bug). The new file has the same mtime as the old one, so the cached entry still looks current.
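To make the failure mode concrete, here is a minimal plain-Lua sketch. It does not call Neovim: `fake_stat` and `mtime_key` are hypothetical names, `fake_stat` only imitates the shape of the table that `fs_stat` returns, and the second file size is made up for illustration.

```lua
-- Hypothetical stand-in for the table returned by fs_stat (assumption:
-- only the fields used in this thread are modelled).
local function fake_stat(mtime_sec, size)
  return { mtime = { sec = mtime_sec, nsec = 0 }, size = size }
end

-- Same idea as the hash() quoted earlier in the thread: mtime seconds only.
local function mtime_key(stat)
  return stat.mtime.sec
end

-- The thread reports a pinned mtime of 0; the exact constant does not matter,
-- only that it is identical for every file in the store.
local PINNED_MTIME = 0

-- Two generations of the same config file: the contents (and here the size)
-- differ, but the timestamp is pinned.
local before = fake_stat(PINNED_MTIME, 523)
local after = fake_stat(PINNED_MTIME, 611)

-- The keys compare equal, so the stale cache entry would be reused.
print(mtime_key(before) == mtime_key(after))
```

Deleting the cache files by hand works around this because it removes the entries that the unchanged key would otherwise match.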

I guess this is a long-winded way of trying to figure out if this is just meant to be good enough for the majority use-case. If so, I respect that decision and feel free to close the issue if you see no value in this 👍.

lewis6991 commented 2 years ago

We can take the same approach Python has: the current solution of using mtime works well for 95% of users and thus should remain the default.

If you want to put forward a PR that allows an alternative way of validating the cache then I'm happy to review it. However, note that Neovim currently doesn't have any hashing functions available, so one would need to be added somehow; either implemented from scratch or imported from a third party library.
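As a rough illustration of what a content-based validator implemented from scratch could look like, here is a plain-Lua sketch. Everything in it is an assumption rather than anything proposed in this thread: `checksum` and `content_key` are hypothetical names, and the djb2-style checksum is a simple non-cryptographic stand-in, not the hash PEP 552 uses.

```lua
-- Non-cryptographic djb2-style checksum over a string. The value is reduced
-- modulo 2^32 at every step so it stays exact in Lua numbers.
local function checksum(contents)
  local h = 5381
  for i = 1, #contents do
    h = (h * 33 + contents:byte(i)) % 4294967296
  end
  return h
end

-- Hypothetical validator: the whole source file has to be read before the
-- cached chunk can be trusted, which is the cost discussed above.
local function content_key(path)
  local f = io.open(path, 'rb')
  if not f then
    return nil
  end
  local contents = f:read('*a')
  f:close()
  return checksum(contents)
end

-- Two same-length sources with identical metadata still get different keys.
print(checksum('local x = 1\n') ~= checksum('local x = 2\n'))
```

The trade-off is the one already raised: the key no longer depends on timestamps, but every lookup pays for reading the file it was meant to avoid loading.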

azuwis commented 2 years ago

I hit the same problem and use:

  home.activation.neovim = lib.hm.dag.entryAfter [ "writeBoundary" ] ''
    rm -f ~/.cache/nvim/luacache_chunks ~/.cache/nvim/luacache_modpaths
  '';

as a workaround.

lewis6991 commented 2 years ago

I've opened #50, so the hash uses the file size too. It doesn't completely fix this issue, but it should mitigate it: for Nix-managed files the cache will now be invalidated whenever a file's size changes, though the chance of a false-positive cache hit is still much higher than with working timestamps.
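A minimal plain-Lua sketch of a size-aware key, to show what it does and does not catch when mtime is pinned. The names `fake_stat` and `size_aware_key` and the key layout are assumptions for illustration, not the code from #50, and the sizes are made up.

```lua
-- Hypothetical stand-in for the table returned by fs_stat.
local function fake_stat(mtime_sec, size)
  return { mtime = { sec = mtime_sec, nsec = 0 }, size = size }
end

-- Assumed key layout: mtime seconds plus size in bytes.
local function size_aware_key(stat)
  return string.format('%d:%d', stat.mtime.sec, stat.size)
end

local original = fake_stat(0, 523)
local grown = fake_stat(0, 611)     -- an edit that changed the file length
local same_size = fake_stat(0, 523) -- an edit that only swapped characters

-- Keys differ, so the cache entry would be rebuilt.
print(size_aware_key(original) ~= size_aware_key(grown))
-- Keys match, so the stale entry would still be reused.
print(size_aware_key(original) == size_aware_key(same_size))
```

Flipping a boolean between two values of equal length, or changing a single digit, is the kind of edit that still slips through.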

lewis6991 commented 1 year ago

Closing this.

The hashing is now pretty similar to what pycache does. If that isn't good enough for Nix, then that's Nix's problem.