Gabriella439 / nix-diff

Explain why two Nix derivations differ
BSD 3-Clause "New" or "Revised" License

Workflow for diagnosing cache miss with substituter? #56

Closed edrex closed 2 years ago

edrex commented 2 years ago

It would be great to have a general workflow for answering the "what's the diff" question between a local build and a nar in a remote cache. This is a support request, which maybe could result in a paragraph in the README.

I'm trying to diagnose a cache miss with a substituter populated by GH CI. Here's the run that's populating it, listing the store paths being uploaded:

https://github.com/srid/emanote/runs/5807751013

It seems like I need to get the closure of the remote build into the local store, but I haven't figured out how.

❯ nix copy --from https://srid.cachix.org /nix/store/myrxpghd461gq5v04hmprclg5plssc7h-emanote-0.5.5.6
error: path '/nix/store/0g8x4ji79albbsvwk193j3s3fkrqr0a4-optparse-applicative-0.16.1.0' is not valid

Maybe I have to manually copy the deps from the nixpkgs cache?

I suspect the source dir is the difference anyway:

❯ nix copy --from https://srid.cachix.org /nix/store/bh3hbis2b2dqycb3f0q6dmrch3xsaahn-emanote
❯ nix-diff /nix/store/rqxdshjj2bcw9k5zn7yzvxlcca177wpj-emanote /nix/store/bh3hbis2b2dqycb3f0q6dmrch3xsaahn-emanote
nix-diff: unknown-deriver: openBinaryFile: does not exist (No such file or directory)

Not sure what this means.

Guidance?

Gabriella439 commented 2 years ago

If you see a specific Nix store path in a remote build, try to fetch it using nix copy, and the fetch fails, then nix-diff cannot help with that. What nix-diff can help with is the case where you try to reproduce the build locally and get a derivation whose output doesn't match a remote build product; nix-diff can then explain how the local build differs from the remote build by comparing their respective .drv files.
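As a rough sketch of that comparison, assuming both .drv files are already present in the local store: the helper name compare_drvs is hypothetical, and restricting the inputs to .drv paths is a choice made here for clarity, not a requirement of nix-diff itself.

```shell
# Hypothetical helper: compare two derivation files with nix-diff.
# Both arguments must be .drv paths; anything else is rejected up front
# so the failure is easier to read than a missing-file error.
compare_drvs() {
  for p in "$1" "$2"; do
    case "$p" in
      *.drv) ;;
      *) echo "not a .drv file: $p" >&2; return 2 ;;
    esac
  done
  nix-diff "$1" "$2"
}

# Usage (placeholder paths, substitute your own):
#   compare_drvs /nix/store/<local-hash>-emanote.drv /nix/store/<remote-hash>-emanote.drv
```

The remote .drv path has to come from somewhere, for example the CI log or a local evaluation of the same commit that CI built.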

edrex commented 2 years ago

Issue 0: The original issue was that the store object for the flake repo's source checkout was different in CI (and hence in the remote cache), causing a cache miss for the build, so that should be in nix-diff's wheelhouse.

Issue 1: Attempting to nix copy the build from the cache failed because some element of the closure wasn't in the cache. I'm assuming this is because that element came from the nixpkgs cache, and nix copy is too low-level to use the configured substituters. Is there a workaround for this?

Issue 2: Since I could see the store paths of the source objects were different, I tried to copy just the source object, which succeeded, and to run nix-diff on the respective store objects, which failed with nix-diff: unknown-deriver: openBinaryFile: does not exist (No such file or directory). Any idea why/what that error means?
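One possible workaround for Issue 1, offered as an untested sketch: ask nix-store --realise to substitute the path while listing every cache that might hold part of the closure. The helper name realise_from is hypothetical. This assumes the user is allowed to set substituters (for example as a trusted user) and that the signing keys of each cache are already trusted; otherwise Nix may ignore the option or reject the paths.

```shell
# Hypothetical helper: realise a store path using an explicit substituter list.
# Usage: realise_from STORE_PATH SUBSTITUTER_URL...
realise_from() {
  path=$1
  shift
  # "$*" joins the remaining URLs with spaces, which is the format
  # the substituters setting expects.
  nix-store --realise "$path" --option substituters "$*"
}

# Usage with the caches mentioned in this thread:
#   realise_from /nix/store/myrxpghd461gq5v04hmprclg5plssc7h-emanote-0.5.5.6 \
#     https://cache.nixos.org https://srid.cachix.org
```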

edrex commented 2 years ago

Issue 0 was resolved by changing the CI steps, but I would love to understand and document issues 1 and 2 so I can reach for nix-diff the next time I'm troubleshooting a cache miss.

edrex commented 2 years ago

Re issue 2, maybe I am missing the deriver for the source store object?
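A quick way to check that guess, as a minimal sketch: nix-store --query --deriver prints the literal string unknown-deriver when the store has no deriver recorded for a path, which is typical for source paths and for paths fetched from a binary cache without their .drv. That string matches the file name in the error above, which suggests (this is an inference, not confirmed in the thread) that nix-diff tried to open it as a file. The helper name deriver_of is hypothetical.

```shell
# Hypothetical helper: print the deriver of a store path, or fail with a
# clear message when the store has no deriver recorded for it.
deriver_of() {
  drv=$(nix-store --query --deriver "$1") || return 1
  if [ "$drv" = "unknown-deriver" ]; then
    echo "no deriver recorded for $1" >&2
    return 1
  fi
  printf '%s\n' "$drv"
}

# Usage with the source path from this thread:
#   deriver_of /nix/store/bh3hbis2b2dqycb3f0q6dmrch3xsaahn-emanote
```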

It occurs to me that if nix-diff supported using configured substituters for store paths, this could all be much easier. Is that achievable/desirable?

Gabriella439 commented 2 years ago

I think this would be easier to diagnose if you provided the exact /nix/store paths for each step of your description, because I'm still not following which /nix/store paths were present remotely or locally, and which /nix/store paths you expected to be present remotely.

edrex commented 2 years ago

The original issue was resolved by changing the way it was building, so maybe the build was broken in some way.

I've moved on so I'll close this. Thanks for your responses.

Gabriella439 commented 2 years ago

You're welcome!