Closed by adyomin 3 years ago
Hey Andrei,
Thanks for your interest in TorchBeast and for the bug report.
It took me a while to interpret the cmake output here.
I think what happened is that cmake at some point had issues selecting the right Python version out of several, so I fixed it to 3.7. See this line in CMakeLists.txt:
find_package(Python3 3.7 EXACT COMPONENTS Interpreter Development NumPy)
I believe if you were to remove the word EXACT here it might work?
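As a rough illustration of why dropping EXACT matters, the sketch below approximates the version check in Python. It assumes that with EXACT only the requested components (here 3.7) must match the found version, and that without EXACT the requested version acts as a minimum. The helper name version_ok is hypothetical and is not part of CMake or TorchBeast.

```python
def version_ok(found: str, requested: str, exact: bool) -> bool:
    """Hypothetical helper approximating CMake's find_package version check."""
    found_parts = [int(part) for part in found.split(".")]
    req_parts = [int(part) for part in requested.split(".")]
    if exact:
        # Assumption: with EXACT, only the components given in the request are compared.
        return found_parts[:len(req_parts)] == req_parts
    # Assumption: without EXACT, the request is a minimum version.
    # Pad both lists with zeros so that "3.7" and "3.7.0" compare as equal.
    width = max(len(found_parts), len(req_parts))
    found_parts += [0] * (width - len(found_parts))
    req_parts += [0] * (width - len(req_parts))
    return found_parts >= req_parts


# A Python 3.8.10 interpreter against the pinned and the relaxed request.
print(version_ok("3.8.10", "3.7", exact=True))
print(version_ok("3.8.10", "3.7", exact=False))
```

Under these assumptions, a 3.8.10 interpreter is rejected by the pinned form and accepted by the relaxed one.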
The suggested edit does indeed allow the build & installation to complete (with some warnings):
adyomin@DLW ~/s/torchbeast (master)> python setup.py install (nle_38)
running install
running bdist_egg
running egg_info
writing libtorchbeast.egg-info/PKG-INFO
writing dependency_links to libtorchbeast.egg-info/dependency_links.txt
writing requirements to libtorchbeast.egg-info/requires.txt
writing top-level names to libtorchbeast.egg-info/top_level.txt
reading manifest file 'libtorchbeast.egg-info/SOURCES.txt'
writing manifest file 'libtorchbeast.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
-- Found Python3: /home/adyomin/miniconda3/envs/nle_38/bin/python3.8 (found suitable version "3.8.10", minimum required is "3.7") found components: Interpreter Development NumPy Development.Module Development.Embed
-- pybind11 v2.6.2
--
-- 3.14.0.0
-- Caffe2: CUDA detected: 11.3
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 11.3
-- Found cuDNN: v8.2.1 (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 1ea278b5
-- Added CUDA NVCC flags for: -gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
CMake Warning at /home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
CMakeLists.txt:51 (find_package)
-- Configuring done
CMake Warning at /usr/share/cmake-3.18/Modules/FindPython/Support.cmake:3128 (add_library):
Cannot generate a safe runtime search path for target _C because there is a
cycle in the constraint graph:
dir 0 is [/home/adyomin/miniconda3/envs/nle/lib/python3.8/site-packages/torch/lib]
dir 1 is [/usr/local/cuda/lib64/stubs]
dir 2 is [/usr/local/cuda/lib64]
dir 3 must precede it due to runtime library [libnvToolsExt.so.1]
dir 3 is [/home/adyomin/miniconda3/envs/nle/lib]
dir 2 must precede it due to runtime library [libcudart.so.11.0]
dir 4 is [/home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages/torch/lib]
Some of these libraries may not be found correctly.
Call Stack (most recent call first):
/usr/share/cmake-3.18/Modules/FindPython3.cmake:393 (__Python3_add_library)
third_party/pybind11/tools/pybind11NewTools.cmake:196 (python3_add_library)
CMakeLists.txt:96 (pybind11_add_module)
-- Generating done
-- Build files have been written to: /home/adyomin/source/torchbeast/build/temp.linux-x86_64-3.8
[847/1318] Building CXX object grpc/third_party/protobuf/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/message_lite.cc.o
In file included from /usr/include/string.h:519,
from ../../third_party/grpc/third_party/protobuf/src/google/protobuf/stubs/port.h:39,
from ../../third_party/grpc/third_party/protobuf/src/google/protobuf/stubs/macros.h:34,
from ../../third_party/grpc/third_party/protobuf/src/google/protobuf/stubs/common.h:46,
from ../../third_party/grpc/third_party/protobuf/src/google/protobuf/message_lite.h:45,
from ../../third_party/grpc/third_party/protobuf/src/google/protobuf/message_lite.cc:36:
In function ‘void* memcpy(void*, const void*, size_t)’,
inlined from ‘google::protobuf::uint8* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, google::protobuf::uint8*)’ at ../../third_party/grpc/third_party/protobuf/src/google/protobuf/io/coded_stream.h:699:16,
inlined from ‘bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const’ at ../../third_party/grpc/third_party/protobuf/src/google/protobuf/implicit_weak_message.h:85:28:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:33: warning: ‘void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)’ specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
29 | return __builtin___memcpy_chk (__dest, __src, __len,
| ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
30 | __glibc_objsize0 (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
[1318/1318] Linking CXX shared module ../lib.linux-x86_64-3.8/libtorchbeast/_C.cpython-38-x86_64-linux-gnu.so
[0/1] Install the project...
-- Install configuration: "Release"
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/libtorchbeast
copying build/lib.linux-x86_64-3.8/libtorchbeast/_C.cpython-38-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/libtorchbeast
copying build/lib.linux-x86_64-3.8/libtorchbeast/__init__.py -> build/bdist.linux-x86_64/egg/libtorchbeast
byte-compiling build/bdist.linux-x86_64/egg/libtorchbeast/__init__.py to __init__.cpython-38.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying libtorchbeast.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
creating 'dist/libtorchbeast-0.0.20-py3.8-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing libtorchbeast-0.0.20-py3.8-linux-x86_64.egg
creating /home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages/libtorchbeast-0.0.20-py3.8-linux-x86_64.egg
Extracting libtorchbeast-0.0.20-py3.8-linux-x86_64.egg to /home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages
Adding libtorchbeast 0.0.20 to easy-install.pth file
Installed /home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages/libtorchbeast-0.0.20-py3.8-linux-x86_64.egg
Processing dependencies for libtorchbeast==0.0.20
Searching for torch==1.9.0
Best match: torch 1.9.0
Adding torch 1.9.0 to easy-install.pth file
Installing convert-caffe2-to-onnx script to /home/adyomin/miniconda3/envs/nle_38/bin
Installing convert-onnx-to-caffe2 script to /home/adyomin/miniconda3/envs/nle_38/bin
Using /home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages
Searching for typing-extensions==3.7.4.3
Best match: typing-extensions 3.7.4.3
Adding typing-extensions 3.7.4.3 to easy-install.pth file
Using /home/adyomin/miniconda3/envs/nle_38/lib/python3.8/site-packages
Finished processing dependencies for libtorchbeast==0.0.20
Great!
How about we change your PR to just remove EXACT then and see if that causes further problems down the line (which I think it will, for people with several Python versions installed ...)
Thanks, merged.
I was getting a similar error on macOS 11.4 Big Sur, with conda. Upgrading to Python 3.8 fixed it.
(nethack) 88665a14b754:torchbeast maxreede$ python setup.py install
running install
running bdist_egg
running egg_info
writing libtorchbeast.egg-info/PKG-INFO
writing dependency_links to libtorchbeast.egg-info/dependency_links.txt
writing requirements to libtorchbeast.egg-info/requires.txt
writing top-level names to libtorchbeast.egg-info/top_level.txt
reading manifest file 'libtorchbeast.egg-info/SOURCES.txt'
writing manifest file 'libtorchbeast.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.7-x86_64/egg
running install_lib
running build_py
running build_ext
-- Could NOT find Python3 (missing: Python3_NumPy_INCLUDE_DIRS NumPy) (found suitable version "3.9.4", minimum required is "3.7")
-- pybind11 v2.6.2
--
-- 3.14.0.0
CMake Deprecation Warning at third_party/grpc/third_party/zlib/CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Deprecation Warning at third_party/grpc/third_party/googletest/CMakeLists.txt:4 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Deprecation Warning at third_party/grpc/third_party/googletest/googlemock/CMakeLists.txt:45 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Deprecation Warning at third_party/grpc/third_party/googletest/googletest/CMakeLists.txt:56 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
/Volumes/workplace/ml/aicrowd/torchbeast/_Python3_NumPy_INCLUDE_DIR
used as include directory in directory /Volumes/workplace/ml/aicrowd/torchbeast
CMake Error at third_party/pybind11/tools/pybind11Tools.cmake:166 (add_library):
Target "_C" links to target "Python3::NumPy" but the target was not found.
Perhaps a find_package() call is missing for an IMPORTED target, or an
ALIAS target is missing?
Call Stack (most recent call first):
CMakeLists.txt:96 (pybind11_add_module)
CMake Warning (dev):
Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
--help-policy CMP0042" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
MACOSX_RPATH is not specified for the following targets:
zlib
This warning is for project developers. Use -Wno-dev to suppress it.
-- Generating done
CMake Generate step failed. Build files cannot be regenerated correctly.
Hey Max,
Thanks for your comment.
It's always a bit hard to parse these CMake errors, but I believe what was missing in your Python 3.9 installation was NumPy. Perhaps there is some issue where CMake has a harder time finding NumPy for Python 3.9, or perhaps it was just not installed in your case?
Don't worry, it is working now :) The catch was that my local conda Python was 3.6, so cmake somehow found my global system Python instead (Homebrew had upgraded it to 3.9), and my system Python does not have NumPy. So it was a bit of a red herring.
Upgrading my local conda python to python=3.8 fixed the issue.
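To spot this kind of mix-up, one option is to run a small check with each candidate interpreter (the conda one and the system one) and compare the results. This is only a sketch using the standard library. The function name describe_interpreter is made up for this example, and it reports what the running Python sees, not what cmake will select.

```python
import importlib.util
import sys


def describe_interpreter() -> dict:
    """Report which Python is running and whether it can import NumPy."""
    # find_spec returns None when the package is not installed for this interpreter.
    numpy_spec = importlib.util.find_spec("numpy")
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "has_numpy": numpy_spec is not None,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If the interpreter named in the cmake output reports has_numpy: False, the missing Python3_NumPy_INCLUDE_DIRS error above is the expected result.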
apt install python3.8-dev
I tried following the NetHack challenge baseline setup instructions using Python 3.8 as suggested. I could not build PolyBeast.
Steps to reproduce the issue (except for repo and sub-modules cloning):
Output of python collect_env.py in the same conda environment: