Open johbo opened 8 years ago
Can you give a way to reproduce? What derivation to build?
Just checked: the Pelican sources don't seem to have this issue anymore. I'll try to create a small derivation to reproduce the issue.
Did you edit the OP with the minimal example? If so, note that edits don't send notifications.
@johbo any luck with the repro? There's already a known issue with the default Darwin case-insensitive HFS+ filesystem, since any FO derivation that contains files with different cases will lose the "overlapping" files and then hash to something different.
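To make the case-collision point concrete, here is a minimal sketch that lists file names which would overlap on a case-insensitive filesystem. The helper name `case_collisions` is hypothetical, and `str.casefold()` is only an approximation of the folding rules HFS+ actually applies.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only by case (approximated with casefold)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups where more than one distinct name folds together.
    return [sorted(group) for group in groups.values() if len(group) > 1]


example = ["README", "readme", "Makefile", "src"]
collisions = case_collisions(example)
print(collisions)
```

Any group reported here would collapse to a single file when unpacked on such a filesystem, so the resulting tree would no longer match the one that was hashed elsewhere.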
I got back to it. Here is how I tried to reproduce it. Maybe that helps to decide whether there is a problem inside Nix at all or whether the issue sits somewhere else.
I've put sources into this repository: https://github.com/johbo/reproduce-nix-unicode-darwin
Basic idea is to use fetchurl to get sources from a repository:
tarball = pkgs.fetchzip {
  url = "https://github.com/johbo/reproduce-nix-unicode-darwin/archive/9c7029ef3b9301c9faf55659ea281332f5f6a281.tar.gz";
  sha256 = "1h7z2wax8ywhp0zr08qm78573rcd6nq3y8scl5pbv3lhpilf44sr";
};
The repository contains the file décembre, which is expected to trigger the issue. That's also a filename from the Pelican repository.
I've built things in the following way both on Darwin and on NixOS:
nix-build -A tarball
Last test was with these versions:
One thing I recall from screwing around on Darwin is that HFS+ always stores some normalized form (can't remember the details) of unicode characters, so if you enter your diacritics as combining characters they might get switched to the precomposed forms. Or something like that. We probably just need the hash function to be explicit about what it wants.
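The normalization effect is easy to see outside of Nix. Below is a minimal sketch using Python's standard `unicodedata` module; it only compares the two canonical forms of the name and makes no claim about what Nix or the filesystem does internally. As far as I know, HFS+ stores names in a decomposed form (close to NFD), so a name written with a precomposed é may come back with a separate combining accent.

```python
import unicodedata

name = "d\u00e9cembre"  # "décembre" with a precomposed é (U+00E9)

nfc = unicodedata.normalize("NFC", name)  # precomposed form
nfd = unicodedata.normalize("NFD", name)  # "e" followed by U+0301 (combining acute)

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

print(nfc_bytes.hex(" "))  # 64 c3 a9 63 65 6d 62 72 65
print(nfd_bytes.hex(" "))  # 64 65 cc 81 63 65 6d 62 72 65
print(nfc == nfd)          # False: same text to a reader, different code points

# Normalizing both sides to one agreed form makes them compare equal again.
same_after_normalizing = unicodedata.normalize("NFC", nfd) == nfc
print(same_after_normalizing)  # True
```

Anything that hashes the raw name bytes will see two different inputs here, even though both render as the same filename.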
I marked this as stale due to inactivity.
I closed this issue due to inactivity.
I get different hashes on Darwin if non-ASCII filenames are included.
This is a way to reproduce the problem:
I see this result on Darwin:
And this result on NixOS:
My assumption is that this difference was also causing the issue that I got a different hash for Pelican on Darwin than on NixOS. I tracked the difference down to a file called décembre inside of the source tarball of Pelican. I guess that what we get back as the filename needs special treatment on Darwin, so that we get consistent hashing. I am willing to try things out if someone has a hint for me where to start in the codebase.
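As an illustration of what "special treatment" could mean, here is a toy sketch that hashes a directory listing with and without normalizing the names first. This is not how Nix computes its hashes; `listing_digest` and the two example listings are hypothetical and only show why an explicit normalization form would make the result independent of how the filesystem reports names.

```python
import hashlib
import unicodedata


def listing_digest(names, form=None):
    """Hash a sorted list of file names, optionally normalizing them first."""
    if form is not None:
        names = [unicodedata.normalize(form, n) for n in names]
    h = hashlib.sha256()
    for n in sorted(names):
        data = n.encode("utf-8")
        # Length prefix keeps entry boundaries unambiguous.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


# Assumed views of the same directory: precomposed vs. decomposed é.
linux_view = ["d\u00e9cembre", "README"]
darwin_view = ["de\u0301cembre", "README"]

print(listing_digest(linux_view) == listing_digest(darwin_view))                # False
print(listing_digest(linux_view, "NFC") == listing_digest(darwin_view, "NFC"))  # True
```

Whether normalizing inside the hash function is acceptable is a separate question, since it would treat two byte-wise different names as the same entry.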
Pointers: