Closed mschwaig closed 3 years ago
Hey, I'm not quite sure yet why the six dependency gets lost (I will have a deeper look), but there is a simple workaround: add six >= 1.9.0 to your requirements and add it as a dependency of google-auth via an override:

```nix
mach-nix.mkPython {
  ...
  _.google-auth.propagatedBuildInputs.mod = pySelf: _: oldVal: oldVal ++ [ pySelf.six ];
}
```
Oh, wow. That's a great snippet and it did solve that problem, thanks.
I'm now stuck on two illegal instruction errors, but I have not had a chance to look at the core dump yet and I'm not sure what's going on. It could be this issue or something similar, where the python binary would need some environment variable to be set to select the appropriate code path.
Please feel free to close this issue if you feel like mach-nix's issue tracker is not an appropriate place to dig deeper into this, I would totally understand. Thanks!
mschwaig@mschwaig-lamb:~/tensorflow-hello-world-nix-flake$ nix build --keep-going
warning: Git tree '/home/mschwaig/tensorflow-hello-world-nix-flake' is dirty
warning: unknown setting 'experimental-features'
builder for '/nix/store/3ndjr70zf64m26f4aa21v20sgpfbk0jd-python3.8-h5py-3.1.0.drv' failed with exit code 132; last 10 log lines:
creating build/lib.linux-aarch64-3.8/h5py/tests/test_vds
copying h5py/tests/test_vds/test_highlevel_vds.py -> build/lib.linux-aarch64-3.8/h5py/tests/test_vds
copying h5py/tests/test_vds/test_virtual_source.py -> build/lib.linux-aarch64-3.8/h5py/tests/test_vds
copying h5py/tests/test_vds/__init__.py -> build/lib.linux-aarch64-3.8/h5py/tests/test_vds
copying h5py/tests/test_vds/test_lowlevel_vds.py -> build/lib.linux-aarch64-3.8/h5py/tests/test_vds
copying h5py/tests/data_files/vlen_string_s390x.h5 -> build/lib.linux-aarch64-3.8/h5py/tests/data_files
copying h5py/tests/data_files/vlen_string_dset.h5 -> build/lib.linux-aarch64-3.8/h5py/tests/data_files
copying h5py/tests/data_files/vlen_string_dset_utc.h5 -> build/lib.linux-aarch64-3.8/h5py/tests/data_files
running build_ext
/nix/store/ic98d5gbvxhjklifqg9gqnan3h1hkw2r-setuptools-setup-hook/nix-support/setup-hook: line 17: 22 Illegal instruction (core dumped) /nix/store/a82rn0d51xyr47zad9abp0dihblzb9gk-python3-3.8.7/bin/python3.8 nix_run_setup bdist_wheel
builder for '/nix/store/gnv8z1q5zbnv758q0bckzq6myxq174kk-python3.8-scipy-1.6.0.drv' failed with exit code 132; last 10 log lines:
unpacking source archive /nix/store/7zg9gv304kbvga11dm3a66m1ik40iclk-scipy-1.6.0.tar.gz
source root is scipy-1.6.0
setting SOURCE_DATE_EPOCH to timestamp 1609369918 of file scipy-1.6.0/PKG-INFO
patching sources
updateAutotoolsGnuConfigScriptsPhase
configuring
no configure script, doing nothing
building
Executing setuptoolsBuildPhase
/nix/store/ic98d5gbvxhjklifqg9gqnan3h1hkw2r-setuptools-setup-hook/nix-support/setup-hook: line 17: 24 Illegal instruction (core dumped) /nix/store/a82rn0d51xyr47zad9abp0dihblzb9gk-python3-3.8.7/bin/python3.8 nix_run_setup build_ext --fcompiler='gnu95' bdist_wheel
cannot build derivation '/nix/store/rzrrqm7yybhhv437543d6vhf1b60yafh-python3.8-Keras_Preprocessing-1.1.2.drv': 1 dependencies couldn't be built
builder for '/nix/store/hz5y61x87d23b41jrs6i74jrws50pdyc-python3.8-tensorflow-tensorboard-2.4.0.drv' failed with exit code 132; last 10 log lines:
Rewriting #!/nix/store/a82rn0d51xyr47zad9abp0dihblzb9gk-python3-3.8.7/bin/python3.8 to #!/nix/store/a82rn0d51xyr47zad9abp0dihblzb9gk-python3-3.8.7
wrapping `/nix/store/6f3knyk4zax74bwwf1kwzm2lfr384knw-python3.8-tensorflow-tensorboard-2.4.0/bin/tensorboard'...
Executing pythonRemoveTestsDir
Finished executing pythonRemoveTestsDir
pythonCatchConflictsPhase
pythonRemoveBinBytecodePhase
pythonImportsCheckPhase
Executing pythonImportsCheckPhase
Check whether the following modules can be imported: tensorboard tensorboard.backend tensorboard.compat tensorboard.data tensorboard.plugins tensorboard.summary tensorboard.util
/nix/store/4w6dxpvsgip2djk7b6s28xjqkhpk14s1-python-imports-check-hook.sh/nix-support/setup-hook: line 9: 710 Illegal instruction (core dumped) /nix/store/a82rn0d51xyr47zad9abp0dihblzb9gk-python3-3.8.7/bin/python3.8 -c 'import os; import importlib; list(map(lambda mod: importlib.import_module(mod), os.environ["pythonImportsCheck"].split()))'
cannot build derivation '/nix/store/13q1dd1ikahphsfg6dyz69zajv802ryj-python3-3.8.7-env.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/l8gvlk3pk3d3qf972wczwli868nrrgr7-tensorflow-2.4.0.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/jjf6ajgzi6vbapmjiljpqr8yhpl8nvd7-python3.8-tensorflow-2.4.0.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/8kk2yxam2c8xl6agc4xjyzaxwp23m6d0-python3.8-tensorflow-hello-world-0.1.0.drv': 1 dependencies couldn't be built
error: --- Error ------------------------------------------------------------------------------------- nix
build of '/nix/store/8kk2yxam2c8xl6agc4xjyzaxwp23m6d0-python3.8-tensorflow-hello-world-0.1.0.drv' failed
I'd like mach-nix to have good support for aarch64, therefore I'm definitely interested in fixing this. But I'm not sure how much time I can allocate for this. If you could dig a bit deeper, I would appreciate it. Does the tensorflow package from nixpkgs (without using mach-nix) work on aarch64?
I just found the cause of the original problem. google-auth depends on six, and the google-auth package in nixpkgs doesn't declare a dependency on six. On Hydra it builds, because tests are enabled and another sub-dependency declares six as a checkInput. But mach-nix disables tests by default, and therefore six doesn't end up in the build environment.
This should be fixed in nixpkgs, but we could also consider improving mach-nix to fix such mistakes automatically.
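For illustration, a consumer-side fix in plain nixpkgs (without mach-nix) could look roughly like the overlay sketch below. `overridePythonAttrs`, `packageOverrides`, and `propagatedBuildInputs` are standard nixpkgs Python infrastructure, but the exact attribute layout here is an assumption, not what either project actually ships:

```nix
# Sketch only: add six to google-auth's runtime dependencies
# through a nixpkgs overlay. Treat the details as an assumption,
# not as mach-nix's or nixpkgs's actual implementation.
self: super: {
  python38 = super.python38.override {
    packageOverrides = pySelf: pySuper: {
      google-auth = pySuper.google-auth.overridePythonAttrs (old: {
        # Append six to the runtime (propagated) dependencies.
        propagatedBuildInputs =
          (old.propagatedBuildInputs or [ ]) ++ [ pySelf.six ];
      });
    };
  };
}
```

This is essentially the same operation the `_.google-auth.propagatedBuildInputs.mod` override from the first comment performs, expressed as a plain nixpkgs overlay.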
> I'd like mach-nix to have good support for aarch64, therefore I'm definitely interested in fixing this. But I'm not sure how much time I can allocate for this. If you could dig a bit deeper, I would appreciate it. Does the tensorflow package from nixpkgs (without using mach-nix) work on aarch64?
I don't know. I'll have to try it or ask on IRC.
EDIT: I want to get this working on aarch64, so I can look into it, but I cannot do it right away.
> I just found the cause of the original problem. google-auth depends on six, and the google-auth package in nixpkgs doesn't declare a dependency on six. On Hydra it builds, because tests are enabled and another sub-dependency declares six as a checkInput. But mach-nix disables tests by default, and therefore six doesn't end up in the build environment.

> This should be fixed in nixpkgs, but we could also consider improving mach-nix to fix such mistakes automatically.
I think it's great to automatically find these kinds of discrepancies so that they can be fixed, but an automated fix that just makes the problem go away entirely sounds like it could keep those issues buried and create a discrepancy between the written requirements and what actually happens.
I really like mach-nix and the examples.md it provides; that got me very far without being an expert in either Nix or Python. I'll describe how I looked into the first problem here, in case a user's perspective helps.
This is the first set of issues I ran into where I did not know how to approach the problem on my own. Maybe I'm missing knowledge about how to inspect the dependency tree that mach-nix constructs. I did, for example, look through the relevant derivations and files in the store, and I searched pypi-deps-db for the relevant dependencies manually as well, but that did not help me figure out how things should fit together. From reading your comments on other issues, I have the feeling I maybe should have used nix repl.
Thanks for your PR. Why is it a draft?
Debugging dependency trees in Nix isn't easy. The nix command line tool has why-depends, but it only works for packages that build successfully, and it doesn't show you for what reason a package ended up in the closure.
While debugging this problem yesterday, I decided to create my own helper library. Check out https://github.com/DavHau/nix-toolbox if you like.
There is not a lot inside yet, but it has a function whyDepends. With this you can see why six ends up in the closure of google-auth.
I'm also planning on releasing a new mach-nix version, which makes it a bit simpler to get ahold of the underlying generated nix expression and dependency tree.
Other than that, whenever I need to debug the python code of mach-nix I use ./debug/debug.py, which executes mach-nix outside of nix-build, so you can hook in a debugger.
> This should be fixed in nixpkgs, but we could also consider improving mach-nix to fix such mistakes automatically.

> I think it's great to automatically find these kinds of discrepancies so that they can be fixed, but an automated fix that just makes the problem go away entirely sounds like it could keep those issues buried and create a discrepancy between the written requirements and what actually happens.
BTW, I just noticed that mach-nix does in fact attempt to fix missing dependencies automatically (see this function).
The reason it didn't work in your case is that the pypiData used is too old to contain the recent tensorflow 2.4.0 found in nixpkgs. If you update the flake input pypi-deps-db of mach-nix to a newer version, it should work without having to fix it manually.
But I also included a permanent fix for google-auth now.
> Thanks for your PR. Why is it a draft?

I only verified that this indeed fixed my specific issue and that google-auth still builds afterwards, but I could not get it to fail to build without the fix yet. I can remove the draft flag if you think it's fine like that.

> Debugging dependency trees in Nix isn't easy. The nix command line tool has why-depends, but it only works for packages that build successfully, and it doesn't show you for what reason a package ended up in the closure. While debugging this problem yesterday, I decided to create my own helper library. Check out https://github.com/DavHau/nix-toolbox if you like. There is not a lot inside yet, but it has a function whyDepends. With this you can see why six ends up in the closure of google-auth.

I just tried it and this looks really interesting. Thanks for publishing it! I will test it some more on my future problems.

> I'm also planning on releasing a new mach-nix version, which makes it a bit simpler to get ahold of the underlying generated nix expression and dependency tree. Other than that, whenever I need to debug the python code of mach-nix I use ./debug/debug.py, which executes mach-nix outside of nix-build, so you can hook in a debugger.

I had not seen that.
> This should be fixed in nixpkgs, but we could also consider improving mach-nix to fix such mistakes automatically.

> I think it's great to automatically find these kinds of discrepancies so that they can be fixed, but an automated fix that just makes the problem go away entirely sounds like it could keep those issues buried and create a discrepancy between the written requirements and what actually happens.

> BTW, I just noticed that mach-nix does in fact attempt to fix missing dependencies automatically (see this function). The reason it didn't work in your case is that the pypiData used is too old to contain the recent tensorflow 2.4.0 found in nixpkgs. If you update the flake input pypi-deps-db of mach-nix to a newer version, it should work without having to fix it manually. But I also included a permanent fix for google-auth now.
Oh, yeah. It adds dependencies that it got from other providers.
Does it only do that for nixpkgs? I wonder whether that logic adding something always indicates a bug in nixpkgs, or whether there are sometimes intentional discrepancies.
It's also interesting to think that this logic is part of the interface.
In this case something this logic does should land in nixpkgs.
> Does it only do that for nixpkgs? I wonder if that logic adding something is always a bug in nixpkgs or if there are intentional discrepancies sometimes.
It only does that for nixpkgs. For the sdist or wheel providers there is no need to fix anything, since the dependencies are taken directly from the database. That is the best information we have. If a package doesn't declare its dependencies on PyPI, there is not much we can do, other than including a custom patch for it.
In nixpkgs there are also intentional discrepancies. The focus in nixpkgs lies, to some extent, on reducing the number of different package versions, so for some packages wrong library versions plus patches are used. But since mach-nix modifies the package set significantly, we cannot tell whether all the hacks inside nixpkgs still work correctly. I think it's better to replace all deps with correct versions.
> In this case something this logic does should land in nixpkgs.
I think the thing that could be improved in nixpkgs is to clearly separate building from testing into two separate derivations, so that test-time deps are not part of the build. Then, disabling tests wouldn't change anything about the build.
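As a rough sketch of what that separation could look like: nixpkgs already has the `passthru.tests` convention, where a test run becomes its own derivation that depends on the built package. Something along these lines, where `somePackage` and the pytest invocation are hypothetical and this is not how nixpkgs Python packages are currently built:

```nix
# Sketch: split build and test into two derivations.
# `passthru.tests` is an existing nixpkgs convention;
# `somePackage` and the test body below are hypothetical.
{ somePackage, runCommand, python3 }:

somePackage.overrideAttrs (old: {
  # The build derivation no longer runs tests ...
  doCheck = false;

  passthru = (old.passthru or { }) // {
    # ... instead, the test run is its own derivation, so
    # test-time deps never end up in the build environment,
    # and skipping tests cannot change the build output.
    tests.pytest = runCommand "${old.pname or "package"}-tests" {
      buildInputs = [ somePackage python3.pkgs.pytest ];
    } ''
      pytest ${somePackage.src}/tests
      touch $out
    '';
  };
})
```

With this layout, a consumer like mach-nix could drop the test derivation without affecting what gets built, which is exactly the property that was missing in the google-auth case.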
I'm currently thinking about what we could do to make it clearer to the user when mach-nix cannot find the dependencies of a nixpkgs package in the pypi data. I have the following in mind:
I think option 3 would probably be the best approach, since an outdated pypiData is usually the reason for this problem.
For people using mach-nix from a flake, it probably helps a lot to use the following pattern to avoid having an outdated version of pypi-deps-db in the first place.
Since pypi-deps-db is an input to the mach-nix flake, you should add an explicit dependency on pypi-deps-db to your own flake and keep it up to date, so that dependency resolution can rely on the latest available version information:

```nix
{
  ...
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixpkgs-unstable";
    pypi-deps-db = {
      url = "github:DavHau/pypi-deps-db";
      flake = false;
    };
    mach-nix = {
      url = "github:DavHau/mach-nix/3.1.1";
      inputs.pypi-deps-db.follows = "pypi-deps-db";
    };
  };
  ...
}
```

Of course this only helps users that are using flakes, but it does the right thing for nix flake update, which is quite nice. And I think it's a somewhat natural way to express that you want to keep a transitive dependency up to date. I also use it to keep nixpkgs and flake-utils up to date in my mach-nix projects.
PS: I have not had time to look into the remaining aarch64-linux issues yet.
The cause of the illegal instruction errors I described above (https://github.com/DavHau/mach-nix/issues/240#issuecomment-785874276) is indeed the same as described in the linked numpy issue.
Effectively, things that depend on nixpkgs's openblas, like nixpkgs's numpy, probably fail with an illegal instruction error on aarch64 right now.
This should be fixed when https://github.com/NixOS/nixpkgs/pull/117004 lands in nixpkgs, as it fixes the bad machine code that causes the issue.
I'm building tensorflow on my Jetson Nano right now. I will update this issue when I know whether things are working.
Since https://github.com/NixOS/nixpkgs/pull/117004 was merged to staging and then eventually to master, the test project in my repo is building now. It takes quite some time to build tensorflow for aarch64-linux though. I have not checked again whether I can get a binary artifact from some provider.
I think this can be closed now, so I'm doing that. Thanks for your help @DavHau.
I'm trying to get TensorFlow to work inside a Nix-based environment on an aarch64-linux system by using mach-nix.
I created a flake for this, which works fine on x86_64-linux but cannot resolve all of its dependencies on aarch64-linux. I'm not sure how to best deal with those missing dependencies, and I was hoping someone can point me in the right direction. The specific system I am trying this on is a Jetson Nano running Nvidia's Ubuntu-based distro.
I have created a repo with a minimal test case for running TensorFlow here: https://github.com/mschwaig/tensorflow-hello-world-nix-flake.
This works on x86_64-linux and builds a properly runnable thing, but on aarch64-linux both nix build and nix develop fail like this: