DavHau / mach-nix

Create highly reproducible python environments
MIT License

pytorchWithCuda possible? #213

Open timsears opened 3 years ago

timsears commented 3 years ago

In the following code, torch resolves to the package pytorch. Is there a way to make it resolve to pytorchWithCuda instead?

devShell = mach-nix.mkPythonShell rec {
  requirements = ''
    torch
    jupyterlab
    torchvision
    matplotlib
  '';

  providers = {
    torch = "nixpkgs";
  };
};

timsears commented 3 years ago

A complete example is given here

It uses pytorchWithCuda but yields the error:

Automatic extraction of 'pname' from python package source /nix/store/hfgww019flw6byvczdiwcz2qm13fgg7zpython3.7-pytorch-1.7.1 failed.
Please manually specify 'pname' 

I succeeded in getting an environment with a GPU-enabled pytorch with something like this ...

        myPython = (pkgs.python37.withPackages (p: with p; [
          pytorchWithCuda
          jupyterlab
          torchvision
          matplotlib
        ])).override (_ : { ignoreCollisions = true; });

        myShell = pkgs.mkShell rec {

          buildInputs = [
            myPython
            pkgs.conda
          ];

          shellHook = ''
            jupyter lab --notebook-dir=~/
          '';
        };

... However, it doesn't use mach-nix.

The complete code is here (same repo, different branch)

Not sure how to supply the desired pname or why this issue pops up.

InLaw commented 3 years ago

torch seems to be one of the packages that needs more effort than other PyPI packages.

InLaw commented 3 years ago

pname is an issue in some cases

DavHau commented 3 years ago

You could just use overridesPost in mach-nix to enable cuda for pytorch:

let
  mach-nix = import (
    builtins.fetchGit {
      url = "https://github.com/DavHau/mach-nix/";
      ref = "refs/tags/3.1.1";
    }
  ) {
    python = "python37";
  };
in
mach-nix.mkPython {
  requirements = ''
    torch
  '';
  providers.torch = "nixpkgs";
  overridesPost = [(curr: prev: {
    torch = prev.torch.override {
      cudaSupport = true;
    };
  })];
}

I cannot verify, since I don't have a GPU available right now.

Should we add this to the examples.md?
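
Since the snippet above could not be verified on a GPU machine, one way to check it is to wrap the environment in a shell that reports whether torch sees a CUDA device. This is only a sketch under assumptions: `pkgs`, `pythonEnv`, the `<nixpkgs>` lookup, and the `shellHook` are illustrative additions that do not appear in the thread, and the CUDA toolchain is unfree, so the nixpkgs that mach-nix uses may also need `allowUnfree` enabled.

```nix
# Untested sketch: same mkPython call as above, plus a CUDA check on shell entry.
let
  # Assumption: <nixpkgs> is available on NIX_PATH; only used for mkShell.
  pkgs = import <nixpkgs> { };
  mach-nix = import (
    builtins.fetchGit {
      url = "https://github.com/DavHau/mach-nix/";
      ref = "refs/tags/3.1.1";
    }
  ) {
    python = "python37";
  };
  pythonEnv = mach-nix.mkPython {
    requirements = ''
      torch
    '';
    providers.torch = "nixpkgs";
    overridesPost = [(curr: prev: {
      torch = prev.torch.override {
        cudaSupport = true;
      };
    })];
  };
in
pkgs.mkShell {
  buildInputs = [ pythonEnv ];
  # Report whether this torch build can see a CUDA device.
  shellHook = ''
    python -c 'import torch; print("CUDA available:", torch.cuda.is_available())'
  '';
}
```

Saved as `shell.nix`, this would be entered with `nix-shell`. A `False` result can also mean the host driver is missing, not only that the override had no effect.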

InLaw commented 3 years ago

3.1.1 / 3.2.0 results in:

-- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT - Success
Traceback (most recent call last):
  File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/build/source/tools/codegen/gen.py", line 14, in <module>
    from tools.codegen.model import *
  File "/build/source/tools/codegen/model.py", line 39, in <module>
    @dataclass(frozen=True)
  File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 950, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
  File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 801, in _process_class
    for name, type in cls_annotations.items()]
  File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 801, in <listcomp>
    for name, type in cls_annotations.items()]
  File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 659, in _get_field
    if (_is_classvar(a_type, typing)
  File "/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6/lib/python3.7/site-packages/dataclasses.py", line 550, in _is_classvar
    return type(a_type) is typing._ClassVar
AttributeError: module 'typing' has no attribute '_ClassVar'
--
CMake Error at cmake/Codegen.cmake:202 (message):
  Failed to get generated_cpp list
Call Stack (most recent call first):
  caffe2/CMakeLists.txt:2 (include)

-- Configuring incomplete, errors occurred!
See also "/build/source/build/CMakeFiles/CMakeOutput.log".
See also "/build/source/build/CMakeFiles/CMakeError.log".
Traceback (most recent call last):
  File "setup.py", line 717, in <module>
    build_deps()
  File "setup.py", line 313, in build_deps
    cmake=cmake)
  File "/build/source/tools/build_pytorch_libs.py", line 59, in build_caffe2
    rerun_cmake)
  File "/build/source/tools/setup_helpers/cmake.py", line 329, in generate
    self.run(args, env=my_env)
  File "/build/source/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '-GNinja', '-DBUILD_DOCS=', '-DBUILD_NAMEDTENSOR=1', '-DBUILD_PYTHON=True', '-DBUILD_TEST=True', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INCLUDE_PATH=/nix/store/x6fbphhsaj71ssb4dhy3xpr0v69sxafq-blas-3-dev/include:/nix/store/7y59s8b8mi42w1v7p2qmwcd5jbj57sqz-openblas-0.3.12-dev/include:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14/include:/nix/store/i7nss4wvb4i3458zccbpq9lf478mhz49-libffi-3.3-dev/include:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include', '-DCMAKE_INSTALL_PREFIX=/build/source/torch', '-DCMAKE_LIBRARY_PATH=/nix/store/fm4y1vb1kjf8z1zy82mmhz88dj7dff36-blas-3/lib:/nix/store/njh6a13rlg0nq4hhvarw8smzzjr6jjq5-openblas-0.3.12/lib:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14/lib:/nix/store/dzwq4mnbaj5f30gkrc80618l6xlbzwdj-libffi-3.3/lib:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib:/nix/store/cnzvlisa06k5rl9a9z8zmjkn7ba6bp2n-libGL-1.3.2/lib:/nix/store/4s6bvh4c39fb1ihd0966j6f72zirf576-libICE-1.0.10/lib:/nix/store/y20lx0kjwf3slxknrc40zdifjlfh4ijh-libSM-1.2.3/lib:/nix/store/cmzbw5bk1yva5zk4y61jjz9l3190q7a5-libX11-1.6.12/lib:/nix/store/622b1nj4bqhx8vl56215vp7b7apxn5px-libXext-1.3.4/lib:/nix/store/0k979a89ix8xz02jid2g475pwwclzp0c-libXrender-0.9.10/lib:/nix/store/a6rnjp15qgp8a699dlffqj94hzy1nldg-glibc-2.32/lib:/nix/store/0ds5gvys9awz8ab2mybyfhy7532yrhxa-glib-2.66.2/lib:/nix/store/7vaig2i7pna01zygjx4ij3ci5phyhlan-ncurses-6.2-abi5-compat/lib:/nix/store/51hq0xxp9nng3xxfz7dpkhb9lzy7sz84-gcc-9.3.0-lib/lib', 
'-DCMAKE_PREFIX_PATH=/nix/store/x6fbphhsaj71ssb4dhy3xpr0v69sxafq-blas-3-dev:/nix/store/fm4y1vb1kjf8z1zy82mmhz88dj7dff36-blas-3:/nix/store/7y59s8b8mi42w1v7p2qmwcd5jbj57sqz-openblas-0.3.12-dev:/nix/store/njh6a13rlg0nq4hhvarw8smzzjr6jjq5-openblas-0.3.12:/nix/store/cvalgpm8km90bc2rxd1xzbjz4a6srvky-numactl-2.0.14:/nix/store/1s1jrg4c78psbv2jzwz7s168z1sbk9bf-python3.7-cffi-1.14.3-dev:/nix/store/i7nss4wvb4i3458zccbpq9lf478mhz49-libffi-3.3-dev:/nix/store/dzwq4mnbaj5f30gkrc80618l6xlbzwdj-libffi-3.3:/nix/store/6giaa06r2dj7hnmlrdp8i3707m02ypx0-python3.7-pycparser-2.20:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9:/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9:/nix/store/39pg386bn9v8s7k8vxilg6qw1sbyg05n-python3.7-cffi-1.14.3:/nix/store/d7pmiypga59psnv6g3rax0yw2f2yy6f7-python3.7-click-7.1.2:/nix/store/w6fpl4v7s0ychkg9vrzr3lal5nqmzm8b-python3.7-numpy-1.19.4:/nix/store/cnzvlisa06k5rl9a9z8zmjkn7ba6bp2n-libGL-1.3.2:/nix/store/4s6bvh4c39fb1ihd0966j6f72zirf576-libICE-1.0.10:/nix/store/y20lx0kjwf3slxknrc40zdifjlfh4ijh-libSM-1.2.3:/nix/store/cmzbw5bk1yva5zk4y61jjz9l3190q7a5-libX11-1.6.12:/nix/store/622b1nj4bqhx8vl56215vp7b7apxn5px-libXext-1.3.4:/nix/store/0k979a89ix8xz02jid2g475pwwclzp0c-libXrender-0.9.10:/nix/store/a6rnjp15qgp8a699dlffqj94hzy1nldg-glibc-2.32:/nix/store/0ds5gvys9awz8ab2mybyfhy7532yrhxa-glib-2.66.2:/nix/store/7vaig2i7pna01zygjx4ij3ci5phyhlan-ncurses-6.2-abi5-compat:/nix/store/51hq0xxp9nng3xxfz7dpkhb9lzy7sz84-gcc-9.3.0-lib:/nix/store/ys476m00hh1c6maaq3zpqpfqahacwyp3-python3.7-PyYAML-5.3.1:/nix/store/c3620zk6b3qy42yxv070jngarif0lr9c-python3.7-typing-extensions-3.7.4.3:/nix/store/s434dgl2vx5w1nfjvhypgmrnv2di0zjp-python3.7-Pillow-7.2.0:/nix/store/cxj8cvc7rf2jh792cjfsdjps35lprjgs-python3.7-olefile-0.46:/nix/store/8fpdmdk0sm5alpigxmf0smig0ca6bljk-python3.7-six-1.15.0:/nix/store/axj1zmza6m6w3kdj2nnnjwmklklzz3x7-python3.7-future-0.18.2:/nix/store/jprkahny25dzapj0z7i4kj3cv0rm3hj3-python3.7-tensorflow-tensorboard-1.15.0:/nix/store/xy2kprv7l768jqsazpbyx86sv8nw
hjdm-python3.7-Werkzeug-1.0.1:/nix/store/9b1dd71knik83rjjvzgxgf41syi7wf1w-python3.7-itsdangerous-1.1.0:/nix/store/0j51lpyxyyjjqz43najzcaqic08na9ck-python3.7-protobuf-3.14.0-dev:/nix/store/fa5bycnsfkdlms8vv6z594ydvnyc6p4l-python3.7-google-apputils-0.4.2:/nix/store/d5b4mls4fjh1i5yc9851cl84hbyljcpq-python3.7-pytz-2020.1:/nix/store/y490dbqkxh8rxxnp5v56kpic25fc9n5n-python3.7-python-gflags-3.1.2:/nix/store/ky0dgzjxa5d03wxapb4w47g07rfnh27i-python3.7-python-dateutil-2.8.1:/nix/store/ngk96p2yjwf6l5221vh4665hk2d357qm-python3.7-setuptools_scm-4.1.2:/nix/store/6pl98g9cxlwnxh9lzhs88bsznlvrxgzg-python3.7-mox-0.5.3:/nix/store/gkbd359wym4dzxs5jcpb30ma3rdb042s-python3.7-protobuf-3.14.0:/nix/store/c88vj9z0wgzpswi9jwakfawcjkbd236i-python3.7-Markdown-3.2.2:/nix/store/qd0xh9kqz3srdbr53pgl9s7xbdkiyc18-python3.7-setuptools-47.3.1:/nix/store/lj9iscpl3ppj2x8a0i2dl3hll03mxzyj-python3.7-importlib-metadata-1.7.0:/nix/store/zbxm6aqvsdc2vxzg766hfbq0dr71m8db-python3.7-zipp-3.1.0:/nix/store/xr55w2kx636zm7s93q8h77lv4zmcawwb-python3.7-more-itertools-8.4.0:/nix/store/myrwv9hvw93m0421krkkrx7mss394a3f-python3.7-grpcio-1.33.2:/nix/store/6v946n068m8xbrpil2a1k0k1vpbgw3sd-python3.7-absl-py-0.9.0:/nix/store/dxdd6wjbcipfzifw3ik330jym9k3y4dl-python3.7-wheel-0.34.2:/nix/store/afswsnr2198p7c5np5b939ki49c3z63s-python3.7-dataclasses-0.6', '-DNUMPY_INCLUDE_DIR=/nix/store/w6fpl4v7s0ychkg9vrzr3lal5nqmzm8b-python3.7-numpy-1.19.4/lib/python3.7/site-packages/numpy/core/include', '-DPYTHON_EXECUTABLE=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/bin/python3.7', '-DPYTHON_INCLUDE_DIR=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/include/python3.7m', '-DPYTHON_LIBRARY=/nix/store/w5rhnscn5jxxy4ichr6f3gmv0mpa4x8g-python3-3.7.9/lib/libpython3.7m.so.1.0', '-DTORCH_BUILD_VERSION=1.7.0', '-DUSE_MKL=', '-DUSE_MKLDNN=1', '-DUSE_MKLDNN_CBLAS=1', '-DUSE_NUMPY=True', '-DUSE_SYSTEM_NCCL=1', '/build/source']' returned non-zero exit status 1.
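
The AttributeError in the traceback above is raised from the `dataclasses-0.6` backport while running under Python 3.7. That backport targets Python 3.6 and is known to clash with the standard-library `dataclasses` module on newer interpreters. One thing to try, as an untested guess, is filtering the backport out of torch's inputs. The sketch below is an argument set for `mach-nix.mkPython`; that the backport arrives via `propagatedBuildInputs`, and that its `pname` is `dataclasses`, are assumptions.

```nix
# Untested sketch: drop the dataclasses backport from torch's propagated inputs.
{
  requirements = ''
    torch
  '';
  providers.torch = "nixpkgs";
  overridesPost = [(curr: prev: {
    torch = (prev.torch.override {
      cudaSupport = true;
    }).overridePythonAttrs (old: {
      # Assumption: the backport is pulled in through propagatedBuildInputs.
      propagatedBuildInputs = builtins.filter
        (p: (p.pname or "") != "dataclasses")
        (old.propagatedBuildInputs or [ ]);
    });
  })];
}
```

If the backport enters through a different attribute, the same filter would have to be applied there instead.
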
wucke13 commented 2 years ago

So I have torch with cuda kind of working, using the following flake:

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  inputs.flake-utils.url = "github:numtide/flake-utils";
  inputs.mach-nix.url = "github:DavHau/mach-nix";
  inputs.yolov7.url = "github:WongKinYiu/yolov7";
  inputs.yolov7.flake = false;

  outputs = { self, nixpkgs, flake-utils, mach-nix, ... } @ inputs:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs {
          inherit system;
          config = {
            allowUnfree = true;
            cudaSupport = true;
          };
        };
        python = "python310";
        machNix = import mach-nix {
          inherit python;
          inherit pkgs;
        };
        python-yolo-env = machNix.mkPython rec {
          requirements = builtins.readFile (inputs.yolov7 + "/requirements.txt"); #builtins.readFile ./requirements.txt;
          providers = {
            _default = "nixpkgs";

            opencv-python = "wheel";
            thop = "sdist";
          };
        };
      in
      rec {
        packages = {
          inherit python-yolo-env;
        };
        devShells.default = pkgs.mkShell {
          nativeBuildInputs = [ python-yolo-env ];
        };
      }
    );
}

However, there are a few quirks/bugs:

/nix/store/m05fwxipvjc51b411p2gj5djqz0c7apb-python3-3.10.5-env/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")

This is particularly wild, since

sepiabrown commented 1 year ago

* '/nix/store/m05fwxipvjc51b411p2gj5djqz0c7apb-python3-3.10.5-env/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so' exists (note the difference, the `libtorch_cuda_cu.so` is **not** in the `torch/lib` directory).

Those files are generated under torch when it is built with BUILD_SPLIT_CUDA=1, a build flag set in the same way as BUILD_NAMEDTENSOR = setBool true; in the torch derivation. (https://discuss.pytorch.org/t/no-libtorch-cuda-cpp-so-available-when-build-pytorch-from-source/159864)

So you have to set BUILD_SPLIT_CUDA=1 using overridePythonAttrs.
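
A minimal sketch of that suggestion, written as an argument set for `mach-nix.mkPython`. It is untested. The `requirements` and `providers` values are placeholders, and whether the packaged torch version honours `BUILD_SPLIT_CUDA` is an assumption based on the forum post linked above.

```nix
# Untested sketch: pass BUILD_SPLIT_CUDA to torch's build via overridePythonAttrs.
{
  requirements = ''
    torch
    torchvision
  '';
  providers.torch = "nixpkgs";
  overridesPost = [(curr: prev: {
    torch = prev.torch.overridePythonAttrs (old: {
      # Extra derivation attributes are exported as environment variables
      # during the build, which is how torch's build scripts read this flag.
      BUILD_SPLIT_CUDA = "1";
    });
  })];
}
```

Changing this attribute changes the derivation, so torch will be rebuilt from source rather than fetched from a binary cache.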

sdutwsl commented 2 months ago

The following flake.nix makes it work for me. Is there a better way today?

{
  description = "Python shell flake";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";

    mach-nix.url = "github:davhau/mach-nix";

    pypi-deps-db = {
      url = "github:DavHau/pypi-deps-db?rev=ba35683c35218acb5258b69a9916994979dc73a9";
      inputs.mach-nix.follows = "mach-nix";
    };
  };

  outputs = { self, nixpkgs, mach-nix, flake-utils, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        mach = mach-nix.lib.${system};

        pythonEnv = mach.mkPython {
          requirements = builtins.readFile ./requirements.txt;
          providers.torch = "nixpkgs";
          overridesPost = [(curr: prev: {
            torch = prev.torch.override {
              cudaSupport = true;
            };
          })];
        };
      in
      {
        devShells.default = pkgs.mkShellNoCC {
          packages = [ pythonEnv ];

          shellHook = ''
            export PYTHONPATH="${pythonEnv}/bin/python"
          '';
        };
      }
    );
}

requirements.txt

setuptools
torch
torchvision
torchaudio