conda-forge / pytorch-cpu-feedstock

A conda-smithy repository for pytorch-cpu.
BSD 3-Clause "New" or "Revised" License

up to 2.0.1 #172

Closed ngam closed 10 months ago

ngam commented 1 year ago

Checklist

conda-forge-webservices[bot] commented 1 year ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

ngam commented 1 year ago

@conda-forge-admin, please rerender

RaulPPelaez commented 1 year ago

GCC 10 builds fail due to OneDNN:

  /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1686968646633/work/torch/csrc/jit/codegen/onednn/graph_helper.h:3:10: fatal error: oneapi/dnnl/dnnl_graph.hpp: No such file or directory
      3 | #include <oneapi/dnnl/dnnl_graph.hpp>
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

OneDNN does not seem to be listed in the build requirements: https://github.com/conda-forge/pytorch-cpu-feedstock/blob/f2474772cb070bd3131bbb349e0d6f856c79747a/recipe/meta.yaml

GCC 12 builds fail because of an incompatibility between FBGEMM and GCC 12: https://github.com/pytorch/pytorch/issues/77939 https://github.com/pytorch/FBGEMM/issues/1666. Apparently the error is a false positive and can be ignored. Users got around it by silencing maybe-uninitialized (this is now recommended in FBGEMM's README):

# Note the leading space so the flags don't get glued onto the previous value.
export CXXFLAGS+=' -Wno-maybe-uninitialized -Wno-uninitialized -Wno-free-nonheap-object -Wno-nonnull'
export CFLAGS+=' -Wno-maybe-uninitialized -Wno-uninitialized -Wno-free-nonheap-object -Wno-nonnull'

I want to help, but I do not really know how a feedstock works -.-

ngam commented 1 year ago

Thanks, I will follow up with your suggested fixes soon.

We all learn the conda-forge ways by watching 😉

github-actions[bot] commented 1 year ago

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/pytorch-cpu-feedstock/actions/runs/5315469498.

ngam commented 1 year ago

We still have the onednn issue. Let me think about how to address this...

RaulPPelaez commented 1 year ago

Inspecting the first lines of the failed builds:

## Package Plan ##

  environment location: /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687227140059/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh

The following NEW packages will be INSTALLED:

    _libgcc_mutex:      0.1-conda_forge            conda-forge
    _openmp_mutex:      4.5-2_kmp_llvm             conda-forge
    brotli:             1.0.9-h166bdaf_8           conda-forge
    brotli-bin:         1.0.9-h166bdaf_8           conda-forge
    bzip2:              1.0.8-h7f98852_4           conda-forge
    ca-certificates:    2023.5.7-hbcca054_0        conda-forge
    certifi:            2023.5.7-pyhd8ed1ab_0      conda-forge
    charset-normalizer: 3.1.0-pyhd8ed1ab_0         conda-forge
    cuda-version:       11.8-h70ddcb2_2            conda-forge
    cudatoolkit:        11.8.0-h37601d7_11         conda-forge
    cudnn:              8.8.0.121-h0800d71_1       conda-forge
    future:             0.18.3-pyhd8ed1ab_0        conda-forge
    icu:                72.1-hcb278e6_0            conda-forge
    idna:               3.4-pyhd8ed1ab_0           conda-forge
    ld_impl_linux-64:   2.40-h41732ed_0            conda-forge
    libblas:            3.9.0-16_linux64_mkl       conda-forge
    libbrotlicommon:    1.0.9-h166bdaf_8           conda-forge
    libbrotlidec:       1.0.9-h166bdaf_8           conda-forge
    libbrotlienc:       1.0.9-h166bdaf_8           conda-forge
    libcblas:           3.9.0-16_linux64_mkl       conda-forge
    libffi:             3.4.2-h7f98852_5           conda-forge
    libgcc-ng:          13.1.0-he5830b7_0          conda-forge
    libgomp:            13.1.0-he5830b7_0          conda-forge
    libhwloc:           2.9.1-nocuda_h7313eea_6    conda-forge
    libiconv:           1.17-h166bdaf_0            conda-forge
    liblapack:          3.9.0-16_linux64_mkl       conda-forge
    libmagma:           2.7.1-hc72dce7_3           conda-forge
    libmagma_sparse:    2.7.1-hc72dce7_4           conda-forge
    libnsl:             2.0.0-h7f98852_0           conda-forge
    libprotobuf:        3.21.12-h3eb15da_0         conda-forge
    libsqlite:          3.42.0-h2797004_0          conda-forge
    libstdcxx-ng:       13.1.0-hfd8a6a1_0          conda-forge
    libuuid:            2.38.1-h0b41bf4_0          conda-forge
    libuv:              1.44.2-h166bdaf_0          conda-forge
    libxml2:            2.11.4-h0d562d8_0          conda-forge
    libzlib:            1.2.13-hd590300_5          conda-forge
    llvm-openmp:        16.0.6-h4dfa4b3_0          conda-forge
    magma:              2.7.1-ha770c72_4           conda-forge
    mkl:                2022.2.1-h84fe81f_16997    conda-forge
    mkl-devel:          2022.2.1-ha770c72_16998    conda-forge
    mkl-include:        2022.2.1-h84fe81f_16997    conda-forge
    nccl:               2.18.3.1-h12f7317_0        conda-forge
    ncurses:            6.4-hcb278e6_0             conda-forge
    numpy:              1.21.6-py310h45f3432_0     conda-forge
    openssl:            3.1.1-hd590300_1           conda-forge
    pip:                23.1.2-pyhd8ed1ab_0        conda-forge
    pkg-config:         0.29.2-h36c2ea0_1008       conda-forge
    pysocks:            1.7.1-pyha2e5f31_6         conda-forge
    python:             3.10.11-he550d4f_0_cpython conda-forge
    python_abi:         3.10-3_cp310               conda-forge
    pyyaml:             6.0-py310h5764c6d_5        conda-forge
    readline:           8.2-h8228510_1             conda-forge
    requests:           2.31.0-pyhd8ed1ab_0        conda-forge
    setuptools:         67.7.2-pyhd8ed1ab_0        conda-forge
    six:                1.16.0-pyh6c4a22f_0        conda-forge
    sleef:              3.5.1-h9b69904_2           conda-forge
    tbb:                2021.9.0-hf52228f_0        conda-forge
    tk:                 8.6.12-h27826a3_0          conda-forge
    typing:             3.10.0.0-pyhd8ed1ab_0      conda-forge
    typing_extensions:  4.6.3-pyha770c72_0         conda-forge
    tzdata:             2023c-h71feb2d_0           conda-forge
    urllib3:            2.0.3-pyhd8ed1ab_0         conda-forge
    wheel:              0.40.0-pyhd8ed1ab_0        conda-forge
    xz:                 5.2.6-h166bdaf_0           conda-forge
    yaml:               0.2.5-h7f98852_2           conda-forge
    zstd:               1.5.2-h3eb15da_6           conda-forge

OneDNN is not there

RaulPPelaez commented 1 year ago

I see you added mkl-include to meta.yaml, but that is a different package with different headers and names. PyTorch has the -DUSE_MKLDNN CMake flag to use MKL instead of the newer oneAPI, but there has been an effort to migrate to oneAPI in PyTorch: https://github.com/pytorch/pytorch/pull/32422. So I guess we should use that. The conda-forge package is just "onednn" and it includes the missing header.
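
A quick way to check that claim (hedged sketch; the environment name is arbitrary and the header path is inferred from the error above):

# Create a throwaway env with the conda-forge onednn package and confirm that it
# ships the header the build cannot find. Env name and path are assumptions.
conda create -n onednn-check -c conda-forge onednn -y
conda run -n onednn-check bash -c 'ls "$CONDA_PREFIX/include/oneapi/dnnl/dnnl_graph.hpp"'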

ngam commented 1 year ago

I tried adding onednn in https://github.com/conda-forge/pytorch-cpu-feedstock/pull/172/commits/48b28139fb57276fe18212a425c4b677888e7300 and it didn’t work. I got the mkl-include idea from looking at how pytorch builds their own conda package (no onednn, but they use mkl-include)

RaulPPelaez commented 1 year ago

What was the error when you added onednn? I cannot see the log.

ngam commented 1 year ago

My plan was to add onednn again if mkl-include didn't work (i.e., adding one piece at a time)

RaulPPelaez commented 1 year ago

The compiler finds oneDNN, but eventually it fails with this error:

  /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687262527413/work/torch/csrc/jit/codegen/onednn/operator.h:98:15: error: no matching function for call to 'dnnl::graph::op::set_attr(std::string&, std::__cxx11::basic_string<char>)'
  /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687262527413/work/torch/csrc/jit/codegen/onednn/operator.h:98:16: note:   cannot convert 'name' (type 'std::string' {aka 'std::__cxx11::basic_string<char>'}) to type 'dnnl::graph::op::attr'

It looks to me like a version mismatch between oneDNN and pytorch. Honestly no idea how to solve it -.- Sadly I cannot find any references to it online.

These official tips are 4 years old, but it's the best I could find: https://github.com/pytorch/pytorch/blame/main/CONTRIBUTING.md#c-development-tips

EDIT: I found these: https://github.com/pytorch/pytorch/pull/103745 https://github.com/pytorch/pytorch/pull/97957 Which makes me think that onednn should be included in third_party automatically. My head hurts -.-

ngam commented 1 year ago

Which makes me think that onednn should be included in third_party automatically. My head hurts -.-

That was going to be my next question --- why didn't we encounter this problem before? I am not sure what else is changing, but on our end, it seems nothing has changed since the last build. Is there anything that could be related in here: https://github.com/pytorch/pytorch/compare/v2.0.0...v2.0.1

ngam commented 1 year ago

I found these: pytorch/pytorch#103745 pytorch/pytorch#97957

We can either:

  1. Apply one or both of these patches to this PR
  2. Use the older onednn version these PRs discuss (currently we are pulling the newer onednn)

Do you have a preference?

RaulPPelaez commented 1 year ago

Which makes me think that onednn should be included in third_party automatically. My head hurts -.-

That was going to be my next question --- why didn't we encounter this problem before? I am not sure what else is changing, but on our end, it seems nothing has changed since the last build. Is there anything that could be related in here: pytorch/pytorch@v2.0.0...v2.0.1

As far as I can tell there is nothing in that diff messing with the building process.

RaulPPelaez commented 1 year ago

Do you have a preference?

I would use the older oneDNN.

RaulPPelaez commented 1 year ago

OTOH the CPU builds work, and I believe it is due to this USE_MKLDNN here: https://github.com/conda-forge/pytorch-cpu-feedstock/blob/6b2d2b8b4fea88916ff6c9aef810216f0bb666e8/recipe/build_pytorch.sh#L128-L139. This branch only runs when the CUDA compiler is None, so maybe that is why oneDNN is not even needed.

ngam commented 1 year ago

Smart! Let's add MKLDNN to the CUDA builds too. I will edit the PR with this first, then try the older onednn if it doesn't work.
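
For clarity, a minimal sketch of the change being discussed, mirroring the structure of recipe/build_pytorch.sh linked above (a sketch, not the exact committed diff):

# In the CUDA branch of recipe/build_pytorch.sh, also enable MKLDNN,
# matching what the CPU-only branch already does.
if [[ ${cuda_compiler_version} != "None" ]]; then
    export USE_MKLDNN=1
fi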

ngam commented 1 year ago

Btw, do you know if they added support for Grace Hopper GPUs yet? If so, we may need to edit the compute capability list we have (TORCH_CUDA_ARCH_LIST). I can double-check once everything passes.
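
For reference, a hedged illustration of what such an edit could look like; Hopper-class GPUs are compute capability 9.0, and the other entries below are placeholders rather than the feedstock's actual list:

# Illustration only: append Hopper (sm_90) to the arch list. Check the recipe
# for the real set of architectures before changing anything.
export TORCH_CUDA_ARCH_LIST="7.0;7.5;8.0;8.6;9.0+PTX"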

RaulPPelaez commented 1 year ago

That is above my pay grade, dude. I only get up to 4090s; not complaining, though :P

I learned how to build locally and tried setting the MKLDNN flag; it still failed with the same error. I am now trying again, but also removing onednn as a dependency in meta.yaml. Perhaps it is interfering somehow. I checked, and pytorch comes with onednn via the "ideep" submodule: https://github.com/intel/ideep/tree/fe8378249600442043b98f333b8b605bedca5a25. It says mkl-dnn, but if you click it, it redirects you to onednn. Fun stuff.

Also, I believe your diff is not correct. pytorch@v2.0.0 points to the tag v2.0.0-rc6, which I bet is more recent than the tag that was used to build current v2.0.0, correct me if I am wrong. Judging by the dates in the feedstock we should be comparing these: https://github.com/pytorch/pytorch/compare/v2.0.0-rc4...v2.0.1
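
If it helps, a hedged sketch of running that comparison locally, restricted to build-system files (the file selection is an assumption about where a packaging regression would most likely show up):

# Compare the two tags mentioned above, looking only at the build system.
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git diff v2.0.0-rc4 v2.0.1 -- CMakeLists.txt cmake/ setup.py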

ngam commented 1 year ago

If this keeps failing, I will consider moving to testing cuda 12. Let me know if you can find anything useful in the diffs. I will take care of the newer gpus stuff 😝

RaulPPelaez commented 1 year ago

Building locally with USE_MKLDNN=1 and onednn removed from the dependencies yields the same error:

In file included from /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687351558668/work/torch/csrc/jit/codegen/onednn/graph_fuser.h:3,
                   from /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687351558668/work/torch/csrc/jit/codegen/onednn/graph_fuser.cpp:1:
  /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1687351558668/work/torch/csrc/jit/codegen/onednn/graph_helper.h:3:10: fatal error: oneapi/dnnl/dnnl_graph.hpp: No such file or directory
      3 | #include <oneapi/dnnl/dnnl_graph.hpp>
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.

RaulPPelaez commented 1 year ago

The file is located at third_party/ideep/mkl-dnn/include/oneapi/dnnl/dnnl_graph.hpp. However, this directory is not in the include path when compiling that file:

# Pull out the -I include directories from the last C++ command that compiled graph_fuser.cpp:
$ cat log.txt  | grep "graph_fuser.cpp" | grep "c++" | tail -1 | tr ' ' '\n' | grep '\-I' | cut -d/ -f8-
build/aten/src
aten/src
build

third_party/onnx
build/third_party/onnx
third_party/foxi
build/third_party/foxi
torch/csrc/api
torch/csrc/api/include
caffe2/aten/src/TH
build/caffe2/aten/src/TH
build/caffe2/aten/src
build/caffe2/../aten/src
torch/csrc
third_party/miniz-2.1.0
third_party/kineto/libkineto/include
aten/../third_party/catch/single_include
aten/src/ATen/..
third_party/FXdiv/include
c10/..
third_party/pthreadpool/include
third_party/cpuinfo/include
third_party/QNNPACK/include
aten/src/ATen/native/quantized/cpu/qnnpack/include
aten/src/ATen/native/quantized/cpu/qnnpack/src
third_party/cpuinfo/deps/clog/include
third_party/NNPACK/include
third_party/fbgemm/include
third_party/fbgemm
third_party/fbgemm/third_party/asmjit/src
third_party/FP16/include
third_party/tensorpipe
build/third_party/tensorpipe
third_party/tensorpipe/third_party/libnop/include
third_party/fmt/include
third_party/flatbuffers/include

The CMake output warns about this:

  -- Will build oneDNN Graph
  -- MKLDNN source files not found!
  CMake Warning at cmake/Dependencies.cmake:1762 (message):
    MKLDNN could not be found.
  Call Stack (most recent call first):
    CMakeLists.txt:717 (include)

The files are there in my local build_artifact and the FindMKLDNN.cmake file looks correct to me:

IF(NOT MKLDNN_FOUND)
  SET(MKLDNN_LIBRARIES)
  SET(MKLDNN_INCLUDE_DIR)

  SET(IDEEP_ROOT "${PROJECT_SOURCE_DIR}/third_party/ideep")
  SET(MKLDNN_ROOT "${PROJECT_SOURCE_DIR}/third_party/ideep/mkl-dnn/third_party/oneDNN")
  IF(NOT APPLE AND NOT WIN32 AND NOT BUILD_LITE_INTERPRETER)
    MESSAGE("-- Will build oneDNN Graph")
    SET(LLGA_ROOT "${PROJECT_SOURCE_DIR}/third_party/ideep/mkl-dnn")
    SET(BUILD_ONEDNN_GRAPH ON)
  ENDIF(NOT APPLE AND NOT WIN32 AND NOT BUILD_LITE_INTERPRETER)

  FIND_PACKAGE(BLAS)
  FIND_PATH(IDEEP_INCLUDE_DIR ideep.hpp PATHS ${IDEEP_ROOT} PATH_SUFFIXES include)
  FIND_PATH(MKLDNN_INCLUDE_DIR dnnl.hpp dnnl.h PATHS ${MKLDNN_ROOT} PATH_SUFFIXES include)
  IF(NOT MKLDNN_INCLUDE_DIR)
    EXECUTE_PROCESS(COMMAND git${CMAKE_EXECUTABLE_SUFFIX} submodule update --init mkl-dnn WORKING_DIRECTORY ${IDEEP_ROOT})
    FIND_PATH(MKLDNN_INCLUDE_DIR dnnl.hpp dnnl.h PATHS ${MKLDNN_ROOT} PATH_SUFFIXES include)
  ENDIF(NOT MKLDNN_INCLUDE_DIR)
  IF(BUILD_ONEDNN_GRAPH)
    FIND_PATH(LLGA_INCLUDE_DIR oneapi/dnnl/dnnl_graph.hpp PATHS ${LLGA_ROOT} PATH_SUFFIXES include)
  ENDIF(BUILD_ONEDNN_GRAPH)

  IF(NOT IDEEP_INCLUDE_DIR OR NOT MKLDNN_INCLUDE_DIR)
    MESSAGE(STATUS "MKLDNN source files not found!")
    RETURN()
  ENDIF(NOT IDEEP_INCLUDE_DIR OR NOT MKLDNN_INCLUDE_DIR)

The dnnl.hpp files it is looking for are definitely there. And even if the submodule was not there for some reason, it tries to update it right there. The plot thickens...
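
For what it's worth, a hedged sanity check built from the paths in the FindMKLDNN.cmake excerpt above (run from the pytorch source checkout):

# Check that the headers FindMKLDNN.cmake searches for actually exist, using the
# IDEEP_ROOT, MKLDNN_ROOT and LLGA_ROOT locations quoted above.
ls third_party/ideep/include/ideep.hpp
ls third_party/ideep/mkl-dnn/third_party/oneDNN/include/dnnl.hpp
ls third_party/ideep/mkl-dnn/include/oneapi/dnnl/dnnl_graph.hpp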

ngam commented 1 year ago

Maybe we can set MKLDNN_INCLUDE_DIR?

ngam commented 1 year ago

Also, looks like the CI has gone further now? How are you building locally btw? Are you using the build_locally.py script?

RaulPPelaez commented 1 year ago

Yes, I am using the build_locally.py script. It's fast because you can build only one thing, and I can set MAX_JOBS to compile in parallel. Sucks because it clones pytorch every time -.-
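
For anyone following along, a rough sketch of that workflow (hedged: the interactive behaviour is an assumption, and the variant configs normally live under .ci_support/ in the feedstock checkout):

# From a feedstock checkout; MAX_JOBS caps the parallel compile jobs inside the
# build container, as mentioned above.
export MAX_JOBS=8
# Run without arguments; it should offer the .ci_support/ variants to choose
# from, so only one configuration gets built.
python build_locally.py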

Maybe we can set MKLDNN_INCLUDE_DIR?

Perhaps by replacing the pip build command with something like:

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py build --cmake-only      # configure with CMake but stop before compiling
cmake -DTHIS_AND_THAT build             # re-run CMake on the build dir with the extra variables
cd build && make all install

Otherwise AFAIK you cannot influence CMake variables.
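
Concretely, a hedged sketch of the MKLDNN_INCLUDE_DIR idea from above (the header location comes from the earlier grep; whether pre-seeding this CMake cache variable is enough for FindMKLDNN.cmake is an assumption):

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py build --cmake-only
# Point the build at the vendored oneDNN headers explicitly before compiling.
cmake -DMKLDNN_INCLUDE_DIR=$PWD/third_party/ideep/mkl-dnn/include build
cd build && make all install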

RaulPPelaez commented 1 year ago

Anyhow, it looks like the CI was able to compile the offending files; it's almost finished, actually. Maybe we are lucky and that's it...

ngam commented 1 year ago

Don't hold your breath! The CUDA builds won't finish in 6 hours. What we are hoping for here is that they time out.


ngam commented 1 year ago

Well damn, they're finishing in time. Could you test one of the GPU artifacts to see if they work fine locally? Let me know if you don't know how to do that.

ngam commented 1 year ago

@conda-forge/pytorch-cpu @hmaarrfk @h-vetinari @Tobias-Fischer, since when are we able to build cuda112 on the CI here??? This can't be right!

Could you please have a look? Would you like me to incorporate the cuda12 migration in here too?

ngam commented 1 year ago

@RaulPPelaez I will add you as a coauthor to commits in this PR if you don't mind. Please object if you don't want me to do that.

ngam commented 1 year ago

@conda-forge/pytorch-cpu @hmaarrfk @h-vetinari @Tobias-Fischer, since when are we able to build cuda112 on the CI here??? This can't be right!

Could you please have a look? Would you like me to incorporate the cuda12 migration in here too?

I should have checked the logs more carefully:

  -- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "11.2")
  CMake Warning at cmake/public/cuda.cmake:31 (message):
    Caffe2: CUDA cannot be found.  Depending on whether you are building Caffe2
    or a Caffe2 dependent library, the next warning / error will give you more
    info.
  Call Stack (most recent call first):
    cmake/Dependencies.cmake:43 (include)
    CMakeLists.txt:717 (include)

  CMake Warning at cmake/Dependencies.cmake:66 (message):
    Not compiling with CUDA.  Suppress this warning with -DUSE_CUDA=OFF.
  Call Stack (most recent call first):
    CMakeLists.txt:717 (include)

RaulPPelaez commented 1 year ago

The local build looks fine to me. The Docker container appears to provide CUDA in /usr/local/cuda, and I do not see that message about "CUDA not found". The only difference between your last commit and my local version is that I did not introduce this line in the CUDA branch of the build script:

export CMAKE_TOOLCHAIN_FILE="${RECIPE_DIR}/cross-linux.cmake"

Tobias-Fischer commented 1 year ago

I don’t know how to fix the problem, but we should check whether we can catch this somehow in a test case. It’s problematic that the build still passes.

RaulPPelaez commented 1 year ago

In my local build, adding the CMAKE_TOOLCHAIN_FILE makes CMake not pick up CUDA. Removing it makes the error go away.

Not sure how to go about catching this particular mistake.
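
One hedged idea for the test-case question above (a sketch, not something the recipe currently does): for the CUDA variants, make the test fail when the resulting torch was built without CUDA support.

# Candidate check for the recipe's test commands on CUDA builds.
# torch.backends.cuda.is_built() reports whether torch was compiled with CUDA,
# independently of whether a GPU is visible on the test machine.
python -c "import sys, torch; sys.exit(0 if torch.backends.cuda.is_built() else 1)"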

jamestwebber commented 1 year ago

What's the status of this? Confusingly for me, I see pytorch 2.0.1 when I run mamba search, but the pyro-ppl feedstock is failing its test because it can't satisfy the requirement. I've been waiting on this PR before updating the feedstock to the latest version.

RaulPPelaez commented 1 year ago

The CUDA builds are giving us a hard time. For some reason CMake is not finding CUDA correctly, and we have run out of ideas. Mysteriously, a local build used to pass for me; let me see if the latest commit here still does, and I will give it another spin.

Check out the channels in your env, in a base env I only see up to 2.0.0 in conda-forge:

$ mamba search pytorch
...
pytorch                       1.13.1 cuda112py39hb0b7ed5_200  conda-forge         
pytorch                        2.0.0 cpu_py310hd11e9c7_0  conda-forge         
pytorch                        2.0.0 cpu_py311h410fd25_0  conda-forge         
pytorch                        2.0.0 cpu_py38h019455c_0  conda-forge         
pytorch                        2.0.0 cpu_py39he4d1dc0_0  conda-forge         
pytorch                        2.0.0 cuda112py310he33e0d6_200  conda-forge         
pytorch                        2.0.0 cuda112py311h13fee9e_200  conda-forge         
pytorch                        2.0.0 cuda112py38h5e67e12_200  conda-forge         
pytorch                        2.0.0 cuda112py39ha9981d0_200  conda-forge   

Note that the pytorch channel offers 2.0.1.

hmaarrfk commented 1 year ago

@conda-forge-admin, please rerender

jamestwebber commented 1 year ago

Check out the channels in your env, in a base env I only see up to 2.0.0 in conda-forge:

Ah yeah I'm actually seeing it in pkgs/main, I should remove that channel...

Note that the pytorch channel offers 2.0.1.

Yeah it is definitely available there and I assume most people install it from that channel. But for the purposes of the conda-forge feedstock I like to make sure everything is available before updating. Those that need the latest can always use pip.

RaulPPelaez commented 1 year ago

By looking here: https://github.com/pytorch/pytorch/blob/0ab74044c2775970d3bc3668454a3152ae18ea82/.ci/docker/common/install_conda.sh#L54 it seems like a particular version of mkl is hardcoded. This Dockerfile is used to build with CUDA: https://github.com/pytorch/pytorch/blob/main/.ci/docker/ubuntu-cuda/Dockerfile. If one follows the breadcrumb trail, I believe the MKLDNN variable is set to 0. With these changes on top of the latest commit, a11a83c, a local build succeeds for me:

modified   recipe/build_pytorch.sh
@@ -125,7 +125,7 @@ if [[ ${cuda_compiler_version} != "None" ]]; then
     export USE_STATIC_CUDNN=0
     export CUDA_TOOLKIT_ROOT_DIR=$CUDA_HOME
     export MAGMA_HOME="${PREFIX}"
-    export USE_MKLDNN=1
+    export USE_MKLDNN=0
 else
     if [[ "$target_platform" == *-64 ]]; then
       export BLAS="MKL"
modified   recipe/meta.yaml
@@ -79,8 +79,8 @@ outputs:
         - requests
         - future
         - six
-        - mkl-devel {{ mkl }}    # [x86]
-        - mkl-include {{ mkl }}  # [x86]
+        - mkl==2021.4.0          # [x86]
+        - mkl-include==2021.4.0  # [x86]
         - libcblas * *_mkl       # [x86]
         - libcblas               # [not x86]
         - liblapack              # [not x86]

The test then fails with:

import: 'torch'                                                                                                                                                              
Traceback (most recent call last):                                                                                                                                           
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1690209346820/test_tmp/run_test.py", line 2, in <module>                                                   
    import torch                                                                                                                                                             
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1690209346820/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla
cehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.10/site-packages/torch/__init__.py", line 228, in <module>              
    _load_global_deps()                                                                                                                                                      
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1690209346820/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla
cehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.10/site-packages/torch/__init__.py", line 187, in _load_global_deps     
    raise err                                                                                                                                                                
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1690209346820/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla
cehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.10/site-packages/torch/__init__.py", line 168, in _load_global_deps     
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)                                                                                                                           
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1690209346820/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pla
cehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.10/ctypes/__init__.py", line 374, in __init__                           
    self._handle = _dlopen(self._name, mode)                                                                                                                                 
OSError: libmkl_intel_lp64.so.1: cannot open shared object file: No such file or directory 
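
A hedged diagnostic sketch for the failure above, with paths taken from the traceback's test-env layout (illustrative, not verified):

# Compare the MKL soname the test environment provides with the one torch's
# loader library was linked against.
ls "$CONDA_PREFIX"/lib/libmkl_intel_lp64.so*
ldd "$CONDA_PREFIX"/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so | grep -i mkl
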
ngam commented 1 year ago

See #176 for updates

shermansiu commented 11 months ago

#176 was closed because the author didn't have enough time to "debug so much for a patch release."

RaulPPelaez commented 11 months ago

At this point I think we should go directly for 2.1.0: https://github.com/pytorch/pytorch/releases/tag/v2.1.0

Alas, the problems I had compiling this one are probably still there.

Looking at the build instructions, it seems like they are somewhat simpler than before, though: https://github.com/pytorch/pytorch#from-source

hmaarrfk commented 11 months ago

I'm not too sure what changed in 4 months, but it seems like something got fixed: https://github.com/conda-forge/pytorch-cpu-feedstock/pull/199

Working on getting through this.

CPU build passed, so now onto everything else...

jakirkham commented 10 months ago

Do we still want this given that PR ( https://github.com/conda-forge/pytorch-cpu-feedstock/pull/195 ) added 2.1.0?