k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.
https://k2-fsa.github.io/k2
Apache License 2.0

pybind11 downloading error #1168

Open zjwang21 opened 1 year ago

zjwang21 commented 1 year ago

Does the installation have to download pybind11 over the network? I am trying to install k2 from source in an environment without network access. I would appreciate your suggestions.

-- Found Git: /usr/bin/git (found version "1.8.3.1")
-- Looking for C++ include cxxabi.h
-- Looking for C++ include cxxabi.h - found
-- Looking for C++ include execinfo.h
-- Looking for C++ include execinfo.h - found
-- Performing Test K2_COMPILER_SUPPORTS_CXX14
-- Performing Test K2_COMPILER_SUPPORTS_CXX14 - Success
-- C++ Standard version: 14
-- Could NOT find Valgrind (missing: Valgrind_INCLUDE_DIR Valgrind_EXECUTABLE)
-- Downloading pybind11 from https://github.com/pybind/pybind11/archive/5bc0943ed96836f46489f53961f6c438d2935357.zip
[ 11%] Performing download step (download, verify and extract) for 'pybind11-populate'
-- verifying file...
file='/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/_deps/pybind11-subbuild/pybind11-populate-prefix/src/5bc0943ed96836f46489f53961f6c438d2935357.zip'
-- SHA256 hash of
/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/_deps/pybind11-subbuild/pybind11-populate-prefix/src/5bc0943ed96836f46489f53961f6c438d2935357.zip
does not match expected value
expected: 'ff65a1a8c9e6ceec11e7ed9d296f2e22a63e9ff0c4264b3af29c72b4f18f25a0'
actual: 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
-- File already exists but hash mismatch. Removing...
-- Downloading...

csukuangfj commented 1 year ago

Please put the pybind11 archive in the directory /tmp.

The filename should be

/tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip

The implementation details are in https://github.com/k2-fsa/k2/blob/master/cmake/pybind11.cmake


You don't need to access the network when installing k2, provided you have downloaded all the cmake dependencies required by k2.

zjwang21 commented 1 year ago

Thanks, that works. But then I got the error below. I have cuda-11.6, and that is the nvcc that "which nvcc" finds:

`running install
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
setuptools.SetuptoolsDeprecationWarning,
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/easy_install.py:159: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
EasyInstallDeprecationWarning,
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/pkg_resources/__init__.py:119: PkgResourcesDeprecationWarning: 0.6.7.develop.2021-11-19t01-51.4c56aa7b is an invalid version and will not be supported in a future release
PkgResourcesDeprecationWarning,
running bdist_egg
running egg_info
writing manifest file 'k2.egg-info/SOURCES.txt'
running install_lib
running build_py
copying k2/python/k2/__init__.py -> build/lib.linux-x86_64-3.7/k2
running build_ext
cmake_path: /mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin/cmake
Setting PYTHON_EXECUTABLE to /mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin/python3
build command is:

            cd build/temp.linux-x86_64-3.7

            cmake -DK2_WITH_CUDA=OFF -DPYTHON_EXECUTABLE=/mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin/python3 -DK2_ENABLE_BENCHMARK=OFF  -DK2_ENABLE_TESTS=OFF  -DCMAKE_INSTALL_PREFIX=/mnt/lustre/wangzhijun2/speech/k2/build/lib.linux-x86_64-3.7/k2 -D CMAKE_CUDA_COMPILER=$(which nvcc) /mnt/lustre/wangzhijun2/speech/k2

            cat k2/csrc/version.h

            make -j6 install

-- CMAKE_VERSION: 3.25.2
-- Disable CUDA support
-- Enabled languages: CXX
-- No CMAKE_BUILD_TYPE given, default to Release
-- Set K2_ENABLE_NVTX to OFF since K2_WITH_CUDA is OFF
-- K2_OS: CentOS Linux release 7.4.1708 (Core)
-- C++ Standard version: 14
-- Could NOT find Valgrind (missing: Valgrind_INCLUDE_DIR Valgrind_EXECUTABLE)
-- Downloading pybind11 from file:///tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip
-- pybind11 is downloaded to /mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/_deps/pybind11-src
-- pybind11 v2.11.0 dev1
-- Python executable: /mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin/python3
CMake Error at /mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:739 (message):
Compiling the CUDA compiler identification source file "CMakeCUDACompilerId.cu" failed.

Compiler: /mnt/cache/wangzhijun2/cuda-11.6/bin/nvcc

Build flags:

Id flags: --keep;--keep-dir;tmp -v

The output was:

1

$ _NVVMBRANCH=nvvm

$ SPACE=

$ CUDART=cudart

$ HERE=/mnt/cache/wangzhijun2/cuda-11.6/bin

$ THERE=/mnt/cache/wangzhijun2/cuda-11.6/bin

$ _TARGETSIZE=

$ _TARGETDIR=

$ _TARGETDIR=targets/x86_64-linux

$ TOP=/mnt/cache/wangzhijun2/cuda-11.6/bin/..

$

NVVMIR_LIBRARY_DIR=/mnt/cache/wangzhijun2/cuda-11.6/bin/../nvvm/libdevice

$

LD_LIBRARY_PATH=/mnt/cache/wangzhijun2/cuda-11.6/bin/../lib:/mnt/cache/wangzhijun2/cuda-11.6/lib64:/usr/local/cuda/lib:/usr/local/cuda/lib64:/mnt/lustre/share/cuda-10.1/lib64:

$

PATH=/mnt/cache/wangzhijun2/cuda-11.6/bin/../nvvm/bin:/mnt/cache/wangzhijun2/cuda-11.6/bin:/mnt/cache/share/gcc/gcc-7.3.0/bin:/mnt/cache/wangzhijun2/cuda-11.6/bin:/mnt/lustre/wangzhijun2/.local/bin:/mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin:/mnt/cache/wangzhijun2/anaconda3/condabin:/mnt/lustre/wangzhijun2/.vscode-server/bin/5235c6bb189b60b01b1f49062f4ffa42384f8c91/bin/remote-cli:/mnt/cache/share/intel64/bin:/mnt/cache/share/spring:/mnt/lustre/share/kaldi/src/fstbin:/mnt/cache/share/platform/env:/usr/local/cuda/bin:/mnt/lustre/share/cuda-10.1/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/opt/puppetlabs/bin

$

INCLUDES="-I/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/include"

$ LIBRARIES=

"-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib/stubs" "-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib"

$ CUDAFE_FLAGS=

$ PTXAS_FLAGS=

$ rm tmp/a_dlink.reg.c

$ gcc -D__CUDA_ARCH=520 -DCUDA_ARCH_LIST__=520 -E -x c++

-DCUDA_DOUBLE_MATH_FUNCTIONS -DCUDACC -DNVCC "-I/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/include" -DCUDACC_VER_MAJOR=11 -DCUDACC_VER_MINOR=6 -DCUDACC_VER_BUILD=124 -DCUDA_API_VER_MAJOR=11 -DCUDA_API_VER_MINOR=6 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp1.ii"

$ cicc --c++14 --gnu_version=70300 --display_error_number

--orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name "/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/CMakeFiles/3.25.2/CompilerIdCUDA/CMakeCUDACompilerId.cu" --allow_managed -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name "CMakeCUDACompilerId.fatbin.c" -tused --gen_module_id_file --module_id_file_name "tmp/CMakeCUDACompilerId.module_id" --gen_c_file_name "tmp/CMakeCUDACompilerId.cudafe1.c" --stub_file_name "tmp/CMakeCUDACompilerId.cudafe1.stub.c" --gen_device_file_name "tmp/CMakeCUDACompilerId.cudafe1.gpu" "tmp/CMakeCUDACompilerId.cpp1.ii" -o "tmp/CMakeCUDACompilerId.ptx"

$ ptxas -arch=sm_52 -m64 "tmp/CMakeCUDACompilerId.ptx" -o

"tmp/CMakeCUDACompilerId.sm_52.cubin"

$ fatbinary --create="tmp/CMakeCUDACompilerId.fatbin" -64

--cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 " "--image3=kind=elf,sm=52,file=tmp/CMakeCUDACompilerId.sm_52.cubin" "--image3=kind=ptx,sm=52,file=tmp/CMakeCUDACompilerId.ptx" --embedded-fatbin="tmp/CMakeCUDACompilerId.fatbin.c"

$ gcc -DCUDA_ARCH_LIST=520 -E -x c++ -DCUDACC -DNVCC

"-I/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/include" -DCUDACC_VER_MAJOR=11 -DCUDACC_VER_MINOR=6 -DCUDACC_VER_BUILD=124 -DCUDA_API_VER_MAJOR=11 -DCUDA_API_VER_MINOR=6 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp4.ii"

$ cudafe++ --c++14 --gnu_version=70300 --display_error_number

--orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name "/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/CMakeFiles/3.25.2/CompilerIdCUDA/CMakeCUDACompilerId.cu" --allow_managed --m64 --parse_templates --gen_c_file_name "tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name "CMakeCUDACompilerId.cudafe1.stub.c" --module_id_file_name "tmp/CMakeCUDACompilerId.module_id" "tmp/CMakeCUDACompilerId.cpp4.ii"

$ gcc -D__CUDA_ARCH=520 -DCUDA_ARCH_LIST__=520 -c -x c++

-DCUDA_DOUBLE_MATH_FUNCTIONS "-I/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/include" -m64 "tmp/CMakeCUDACompilerId.cudafe1.cpp" -o "tmp/CMakeCUDACompilerId.o"

$ nvlink -m64 --arch=sm_52 --register-link-binaries="tmp/a_dlink.reg.c"

"-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib/stubs" "-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib" -cpu-arch=X86_64 "tmp/CMakeCUDACompilerId.o" -lcudadevrt -o "tmp/a_dlink.sm_52.cubin"

$ fatbinary --create="tmp/a_dlink.fatbin" -64 --cicc-cmdline="-ftz=0

-prec_div=1 -prec_sqrt=1 -fmad=1 " -link "--image3=kind=elf,sm=52,file=tmp/a_dlink.sm_52.cubin" --embedded-fatbin="tmp/a_dlink.fatbin.c"

$ gcc -DCUDA_ARCH_LIST=520 -c -x c++

-DFATBINFILE="\"tmp/a_dlink.fatbin.c\"" -DREGISTERLINKBINARYFILE="\"tmp/a_dlink.reg.c\"" -I. -DNV_EXTRA_INITIALIZATION= -DNV_EXTRA_FINALIZATION= -DCUDA_INCLUDE_COMPILER_INTERNAL_HEADERS "-I/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/include" -DCUDACC_VER_MAJOR=11 -DCUDACC_VER_MINOR=6 -DCUDACC_VER_BUILD=124 -DCUDA_API_VER_MAJOR=11 -DCUDA_API_VER_MINOR=6 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -m64 "/mnt/cache/wangzhijun2/cuda-11.6/bin/crt/link.stub" -o "tmp/a_dlink.o"

$ g++ -DCUDA_ARCH_LIST=520 -m64 -Wl,--start-group "tmp/a_dlink.o"

"tmp/CMakeCUDACompilerId.o" "-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib/stubs" "-L/mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib" -lcudadevrt -lcudart_static -lrt -lpthread -ldl -Wl,--end-group -o "a.out"

/usr/bin/ld: /mnt/cache/wangzhijun2/cuda-11.6/bin/../targets/x86_64-linux/lib/libcudart_static.a(cudart_static.o): unrecognized relocation (0x2a) in section `.text'

/usr/bin/ld: final link failed: Bad value

collect2: error: ld returned 1 exit status

--error 0x1 --

Call Stack (most recent call first):
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:48 (__determine_compiler_id_test)
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:47 (enable_language)
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:92 (include)
/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
cmake/torch.cmake:11 (find_package)
CMakeLists.txt:292 (include)

-- Configuring incomplete, errors occurred!
See also "/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/CMakeFiles/CMakeOutput.log".
See also "/mnt/lustre/wangzhijun2/speech/k2/build/temp.linux-x86_64-3.7/CMakeFiles/CMakeError.log".
cat: k2/csrc/version.h: No such file or directory
make: *** No rule to make target 'install'.  Stop.
Traceback (most recent call last):
  File "setup.py", line 264, in <module>
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 148, in setup
    return run_commands(dist)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
    dist.run_commands()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
    self.run_command(cmd)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/install.py", line 74, in run
    self.do_egg_install()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/install.py", line 116, in do_egg_install
    self.run_command('bdist_egg')
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 164, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
    self.run_command(cmdname)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
    cmd_obj.run()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/mnt/cache/wangzhijun2/anaconda3/envs/wzj/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "setup.py", line 208, in build_extension
    raise Exception("Failed to build k2")
Exception: Failed to build k2`

zjwang21 commented 1 year ago

Besides, I tried installing through pip. The installation succeeds, but when I check the version, the following error occurs. I wonder whether there is an easier way to fix the installation error:

ModuleNotFoundError: No module named '_k2'

csukuangfj commented 1 year ago
        cmake -DK2_WITH_CUDA=OFF -DPYTHON_EXECUTABLE=/mnt/cache/wangzhijun2/anaconda3/envs/wzj/bin/python3 -DK2_ENABLE_BENCHMARK=OFF  -DK2_ENABLE_TESTS=OFF  -DCMAKE_INSTALL_PREFIX=/mnt/lustre/wangzhijun2/speech/k2/build/lib.linux-x86_64-3.7/k2 -D CMAKE_CUDA_COMPILER=$(which nvcc) /mnt/lustre/wangzhijun2/speech/k2

Why is there -DK2_WITH_CUDA=OFF in your error log if you want to build a CUDA version of k2?

Could you tell us which document you are following to install k2?

zjwang21 commented 1 year ago

I am following the instructions here: https://k2-fsa.github.io/k2/installation/from_source.html. I want to install without CUDA (CPU only).

zjwang21 commented 1 year ago

I followed the instructions to install CPU-only PyTorch and this error disappeared. But another one occurs:

[ 78%] Built target k2_torch_api
[ 78%] Building CXX object k2/torch/bin/CMakeFiles/ctc_decode.dir/ctc_decode.cc.o
[ 80%] Building CXX object k2/python/csrc/CMakeFiles/_k2.dir/torch/ragged.cc.o
[ 81%] Linking CXX executable ../../../bin/ctc_decode
/usr/bin/ld: warning: libprotobuf.so.32, needed by /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libmkl_intel_lp64.so.2, needed by /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libmkl_gnu_thread.so.2, needed by /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libmkl_core.so.2, needed by /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libsleef.so.3, needed by /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so, not found (try using -rpath or -rpath-link)
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'vdLn'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'vmsErf'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'mkl_sparse_d_trsm'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'Sleef_powf8_u10'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'vmdLn'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'DftiComputeForward'

Thank you very much for your patience

csukuangfj commented 1 year ago

Are you able to find libmkl_intel.so* inside

/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/

find /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/ -name libmkl_intel.so*

If yes, please use

export LD_LIBRARY_PATH=/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/:$LD_LIBRARY_PATH

> I follow the instructions to install cpu-only pytorch and this error disappear

If you don't want to install a CUDA version of k2, then you only need to install a CPU version of PyTorch and there is no need to install cudatoolkit.
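A note on the search pattern above: libmkl_intel.so* only matches names that begin with exactly libmkl_intel.so, so a file such as libmkl_intel_lp64.so.2 (the library named in the linker warning) would not match it. The sketch below illustrates this with Python's fnmatch module, which applies shell-style wildcard rules; the helper name `matching` and the broader pattern are illustrative assumptions, not something proposed in the thread.

```python
from fnmatch import fnmatchcase

# Library names taken from the linker warnings earlier in this thread.
names = ["libmkl_intel_lp64.so.2", "libmkl_gnu_thread.so.2", "libmkl_core.so.2"]


def matching(pattern, candidates):
    """Return the candidates that match a shell-style wildcard pattern."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


# The narrow pattern requires ".so" directly after "libmkl_intel".
print(matching("libmkl_intel.so*", names))
# A broader pattern (hypothetical alternative) allows a suffix before ".so".
print(matching("libmkl*.so*", names))
```

When passing such a pattern to find, quoting it (for example -name 'libmkl*.so*') keeps the shell from expanding the wildcard before find sees it.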

zjwang21 commented 1 year ago

'find /mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/ -name libmkl_intel.so*' returns nothing, but the directory does contain:

/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_lp64.so
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_thread.so.2
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_ilp64.so
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_ilp64.so.2
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_thread.so
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/libmkl_intel_lp64.so.2

After I ran export LD_LIBRARY_PATH=/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/:$LD_LIBRARY_PATH, the previous error disappeared, but another one occurs:

[ 91%] Built target k2_torch_api
[ 91%] Building CXX object k2/python/csrc/CMakeFiles/_k2.dir/torch/v2/ragged_shape.cc.o
[ 92%] Linking CXX executable ../../../bin/ctc_decode
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::runtime_error::runtime_error(std::runtime_error&&)@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::runtime_error::runtime_error(std::runtime_error&&)@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream()@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/ctc_decode] Error 1
make[1]: *** [k2/torch/bin/CMakeFiles/ctc_decode.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 93%] Linking CXX executable ../../../bin/hlg_decode
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::runtime_error::runtime_error(std::runtime_error&&)@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::runtime_error::runtime_error(std::runtime_error&&)@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libc10.so: undefined reference to 'std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream()@GLIBCXX_3.4.26'
/mnt/cache/wangzhijun2/anaconda3/envs/speech/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so: undefined reference to 'std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/hlg_decode] Error 1
make[1]: *** [k2/torch/bin/CMakeFiles/hlg_decode.dir/all] Error 2
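The undefined references in this log carry symbol-version tags such as @GLIBCXX_3.4.30 and @CXXABI_1.3.13. This usually suggests that the PyTorch libraries were built against a newer libstdc++ than the one the linker resolved, although the thread does not confirm that diagnosis. A minimal sketch for summarizing which versions a log asks for; the function name `required_versions` is hypothetical:

```python
import re


def required_versions(linker_output):
    """Return the highest GLIBCXX and CXXABI versions named in linker output."""
    highest = {}
    pattern = r"@(GLIBCXX|CXXABI)_(\d+(?:\.\d+)*)"
    for family, version in re.findall(pattern, linker_output):
        # Compare versions numerically, component by component.
        key = tuple(int(part) for part in version.split("."))
        if key > highest.get(family, ()):
            highest[family] = key
    return {family: ".".join(map(str, key)) for family, key in highest.items()}


# Abbreviated sample built from the version tags that appear in the log above.
sample = (
    "runtime_error(std::runtime_error&&)@GLIBCXX_3.4.26 "
    "exception_ptr::_M_release()@CXXABI_1.3.13 "
    "condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30 "
    "__throw_bad_array_new_length()@GLIBCXX_3.4.29"
)
print(required_versions(sample))
```

The result can then be compared with the versions exported by the libstdc++.so.6 that the build actually links against.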

csukuangfj commented 1 year ago

Are you using conda install to install PyTorch?

If yes, could you please uninstall it and reinstall with pip install?

conda does a lot of things behind the scenes, and almost all users who have trouble with k2 are using conda.

zjwang21 commented 1 year ago

Saved my day! Reinstalling through pip works.

season95 commented 1 year ago

Excuse me, I have the same problem as above. I ran "pip install pybind11", moved it to /tmp, and changed the SHA256 hash on line 25 of https://github.com/k2-fsa/k2/blob/master/cmake/pybind11.cmake. However, a new problem arises at lines 46-47, which says: "[11%] performing patch step for 'pybind11-populate' sed: can't read tools/pybind11Tools.cmake: No such file or directory". Should I find this file somewhere, or did I do something wrong?

csukuangfj commented 1 year ago

Are you using the latest master?

You don't need to run

pip install pybind11

csukuangfj commented 1 year ago

@season95

If you really want to download pybind11, please have a look at https://github.com/k2-fsa/k2/blob/master/cmake/pybind11.cmake

The expected locations are https://github.com/k2-fsa/k2/blob/88c1ad0ab70691a09f061fce74489bdbe0ce73d4/cmake/pybind11.cmake#L30-L36

You can add more possible locations if you like.


The download addresses are listed at https://github.com/k2-fsa/k2/blob/88c1ad0ab70691a09f061fce74489bdbe0ce73d4/cmake/pybind11.cmake#L24-L25

season95 commented 1 year ago

I downloaded the latest release, 1.23.4, and my environment has no access to git either. I did copy the file to /tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip, but it doesn't work somehow. What else can I try or check?

csukuangfj commented 1 year ago

> I downloaded the latest release, 1.23.4,

Please try the latest master.


> I did copy the file to /tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip, but it doesn't work somehow. What else can I try or check?

What is the output of

sha256sum /tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip

season95 commented 1 year ago

Its sha256 is 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'. Should I download the latest pybind11 zip from "https://github.com/pybind/pybind11/archive/5bc0943ed96836f46489f53961f6c438d2935357.zip"?
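The digest reported here, e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, is the SHA-256 of empty input, so the file staged in /tmp has zero bytes. The same value appears as the "actual" hash in the first log of this thread, which is what a failed download leaves behind. A minimal sketch for checking a staged archive before starting the build; the function name `check_archive` and its return labels are hypothetical:

```python
import hashlib

# SHA-256 of zero bytes; a file with this digest is empty.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def check_archive(path, expected_sha256):
    """Classify a staged archive as 'ok', 'empty', or 'mismatch'."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large archives are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    if actual == expected_sha256:
        return "ok"
    if actual == EMPTY_SHA256:
        return "empty"
    return "mismatch"
```

For the file described above, calling check_archive with the path /tmp/pybind11-5bc0943ed96836f46489f53961f6c438d2935357.zip and the expected value ff65a1a8c9e6ceec11e7ed9d296f2e22a63e9ff0c4264b3af29c72b4f18f25a0 (taken from the first log in this thread) would return "empty".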

csukuangfj commented 1 year ago

@season95

> Should I download the latest pybind11 zip

No, please don't do that.

As I have said in https://github.com/k2-fsa/k2/issues/1168#issuecomment-1471210218

Please download pybind11 using the following URLs https://github.com/k2-fsa/k2/blob/88c1ad0ab70691a09f061fce74489bdbe0ce73d4/cmake/pybind11.cmake#L24-L25

The correct sha256 hash sum is https://github.com/k2-fsa/k2/blob/88c1ad0ab70691a09f061fce74489bdbe0ce73d4/cmake/pybind11.cmake#L26

But yours is

> Its sha256 is 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'

season95 commented 1 year ago

Thanks. After I downloaded pybind11, I found that there are more packages, such as cub and moderngpu, to be downloaded. T^T My environment doesn't have access to GitHub, so I will go slowly and see what new problem is waiting for me.

csukuangfj commented 1 year ago

Please have a look at the cmake/ directory.

You can handle the other deps in the same way as pybind11.

csukuangfj commented 1 year ago

https://huggingface.co/csukuangfj/k2-cmake-deps/tree/main

You can find all the tar.gz files in the above repo.

season95 commented 1 year ago

> You can find all the tar.gz files in the above repo.

thanks a lot for your detailed help, i will try it now.

season95 commented 1 year ago

Excuse me, a new problem has just come up:

"""
In file included from /opt/lib/cuda-8.0/include/cuda_runtime.h:78:0,
from <command-line>:0
/opt/lib/cuda-8.0/include/host_config.h:115:2: error: #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
"""

I am using CUDA 10.2, and 'nvcc -V' reports 'Cuda compilation tools release 10.2'. (Multiple CUDA versions were installed earlier, and I selected CUDA 10.2 by putting it in LD_LIBRARY_PATH and PATH.) Why does the build pick up the cuda-8.0 path?

csukuangfj commented 1 year ago

Which version of gcc are you using?

season95 commented 1 year ago

> Which version of gcc are you using?

I am using gcc version 7.3.0.

csukuangfj commented 1 year ago

https://k2-fsa.github.io/k2/installation/cuda-cudnn.html

If you have multiple CUDA versions, please have a look at the above URL.

csukuangfj commented 1 year ago

https://k2-fsa.github.io/k2/installation/faqs.html#error-c-14-or-later-compatible-compiler-is-required-to-use-aten

If you have multiple versions of gcc, please have a look at the above URL.

The error shows you are using a very new gcc, which is not supported by the CUDA toolkit in use.

season95 commented 1 year ago

Thanks for your help. I had already searched the issues and tried the environment variable settings for gcc and CUDA described at the URL above, but the same problem is still there. I use CUDA 10.2 and gcc 7.3.0, which should be compatible. I am confused about why the error message shows the path of cuda-8.0:

In file included from /opt/lib/cuda-8.0/include/cuda_runtime.h:78:0,
from <command-line>:0
/opt/lib/cuda-8.0/include/host_config.h:115:2: error: #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!

csukuangfj commented 1 year ago

> I am confused about why the error message shows the path of cuda-8.0:

You have installed multiple versions of the CUDA toolkit. However, you either have not set the environment variables that tell cmake which CUDA toolkit to use, or you have set them incorrectly.

Please follow our documentation to set up the environment variables, delete the build directory and re-try.

season95 commented 1 year ago

Maybe I did not describe it clearly. Some lines of the log are as follows:

"""
Found CUDA: /opt/lib/cuda-10.2 (found version "10.2")
Caffe2: CUDA detected: 10.2
Caffe2: CUDA nvcc is: /opt/lib/cuda-10.2/bin/nvcc
Caffe2: CUDA toolkit directory: /opt/lib/cuda-10.2
... ...
[32%]Building CUDA Object k2/csrc/CMakeFiles/context.dir/dtype.cu.o
In file included from /opt/lib/cuda-8.0/include/cuda_runtime.h:78:0,
from <command-line>:0
/opt/lib/cuda-8.0/include/host_config.h:115:2: error: #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
"""

What confuses me is that the make log shows two versions of CUDA, first 10.2 and then 8.0. I also tried removing cuda-8.0 from PATH and LD_LIBRARY_PATH, but cuda-8.0 still appears in the log. Do you have any idea about that?

csukuangfj commented 1 year ago

Have you deleted the build directory before you continue?

season95 commented 1 year ago

Hi, I printed out all environment variables yesterday and checked them for cuda-8.0. After I deleted cuda-8.0 from the variable LOADEDMODULES, everything is OK. Now k2 is installed and python3 -m k2.version works. But I am still not very clear about the root cause. Anyway, thanks for helping.
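The check described above, printing every environment variable and looking for cuda-8.0, can be sketched as follows. LOADEDMODULES is normally maintained by the Environment Modules system, and a loaded module may also set include or library search variables beyond PATH and LD_LIBRARY_PATH; whether that is what happened here is an assumption, since the thread does not establish the root cause. The function name `variables_mentioning` and the sample values are hypothetical.

```python
import os


def variables_mentioning(needle, environ=None):
    """Return {name: value} for environment variables whose value contains needle."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in env.items() if needle in value}


# Hypothetical environment resembling the situation in this thread.
sample_env = {
    "PATH": "/opt/lib/cuda-10.2/bin:/usr/bin",
    "LD_LIBRARY_PATH": "/opt/lib/cuda-10.2/lib64",
    "LOADEDMODULES": "gcc/7.3.0:cuda-8.0",
}
print(variables_mentioning("cuda-8.0", sample_env))
```

Calling variables_mentioning("cuda-8.0") with no second argument scans the real environment of the current process. After changing any variable it reports, delete the build directory before re-running, as advised earlier in the thread.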

csukuangfj commented 1 year ago

@season95 Glad to hear that you have managed to install k2 in the end.