k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.
https://k2-fsa.github.io/k2
Apache License 2.0

Cmake issues #225

Status: Closed (danpovey closed this issue 3 years ago)

danpovey commented 3 years ago

@csukuangfj what is the recommended way to tell CMake how to pick up a virtual Python environment? We are having a problem on a new environment in Xiaomi.

csukuangfj commented 3 years ago

We currently rely on pybind11 to select the Python environment. By default, it selects the highest Python version available on PATH.

According to the pybind11 documentation:

https://pybind11.readthedocs.io/en/stable/compiling.html

The target Python version can be selected by setting PYBIND11_PYTHON_VERSION or an exact Python installation can be specified with PYTHON_EXECUTABLE. For example:

cmake -DPYBIND11_PYTHON_VERSION=3.6 ..
# or
cmake -DPYTHON_EXECUTABLE=path/to/python ..

csukuangfj commented 3 years ago

If you have multiple virtual environments with the same python version, I would suggest

cmake -DPYTHON_EXECUTABLE=path/to/python ..
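
A minimal sketch of building that flag for a specific virtual environment. It assumes the environment keeps its interpreter at `<venv>/bin/python`. The helper name `venv_python_flag` and the path `path/to/venv` are hypothetical and not part of k2 or pybind11.

```shell
# Hypothetical helper: print the CMake flag that pins pybind11 to the
# interpreter of a given virtual environment.
# Assumption: the interpreter lives at <venv>/bin/python.
venv_python_flag() {
  printf '%s\n' "-DPYTHON_EXECUTABLE=$1/bin/python"
}

# Example: print the flag for an environment rooted at path/to/venv.
venv_python_flag path/to/venv
```

From the build directory, this could be used as `cmake "$(venv_python_flag path/to/venv)" ..`. Note that there must be no space after the `=` sign, otherwise the variable is set to an empty value.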

csukuangfj commented 3 years ago

Does it solve your problem?

danpovey commented 3 years ago

Thanks! Let's wait to hear from Haowen.

qindazhu commented 3 years ago

Yeah, as I said on WeChat, I used cmake -DPYTHON_EXECUTABLE=/ceph-hw/env/py-k2/bin/python .. and cmake ran successfully without errors. However, when I run make, I get this error:

/ceph-hw/k2/build/_deps/pybind11-src/include/pybind11/detail/common.h:112:10: fatal error: Python.h: No such file or directory
 #include <Python.h>

Then I installed python3.8-dev, which fixed the error above. But now there is another error:

/usr/bin/ld: cannot find -lCUDA_cublas_LIBRARY-NOTFOUND
collect2: error: ld returned 1 exit status
k2/csrc/CMakeFiles/context.dir/build.make:372: recipe for target 'lib/libcontext.so' failed
make[2]: *** [lib/libcontext.so] Error 1
CMakeFiles/Makefile2:587: recipe for target 'k2/csrc/CMakeFiles/context.dir/all' failed
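
The `Python.h` error earlier in this comment means the Python development headers were missing from the include directory. Below is a rough sketch of a check that could be run before `make`. The helper name `has_python_header` is hypothetical, and the demonstration uses a temporary directory in place of a real Python include directory.

```shell
# Hypothetical helper: succeed if Python.h exists in the given include directory.
has_python_header() {
  [ -f "$1/Python.h" ]
}

# Self-contained demonstration: a temporary directory stands in for a real
# Python include directory.
demo_dir="$(mktemp -d)"
if has_python_header "$demo_dir"; then header_before=yes; else header_before=no; fi
touch "$demo_dir/Python.h"
if has_python_header "$demo_dir"; then header_after=yes; else header_after=no; fi
echo "before: $header_before, after: $header_after"
```

For a real check, the include directory of the selected interpreter can be obtained with `python -c "import sysconfig; print(sysconfig.get_paths()['include'])"` and passed to the helper.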

csukuangfj commented 3 years ago

Could you share the log info of the FIRST run of cd build; rm -rf ./*; cmake ..

qindazhu commented 3 years ago

Just to make sure, do you mean running cmake without setting PYTHON_EXECUTABLE, i.e. just cmake ..?

csukuangfj commented 3 years ago

cmake -DPYTHON_EXECUTABLE=/ceph-hw/env/py-k2/bin/python ..

qindazhu commented 3 years ago

OK, here it is, thanks!

-- No CMAKE_BUILD_TYPE given, default to Debug
-- found CUDAToolkit /usr/local/cuda/lib64
-- CUDAToolkit_INCLUDE_DIRS /usr/local/cuda/include
-- CUDAToolkit_LIBRARY_DIR /usr/local/cuda/lib64
-- Downloading pybind11
-- pybind11 is downloaded to /ceph-hw/k2/build/_deps/pybind11-src
-- Found PythonInterp: /ceph-hw/env/py-k2/bin/python (found version "3.8") 
-- Found PythonLibs: python3.8
-- pybind11 v2.5.0
-- PYTHON_EXECUTABLE: /ceph-hw/env/py-k2/bin/python
-- Found CUDA: /usr/local/cuda (found version "10.1") 
-- Caffe2: CUDA detected: 10.1
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 10.1
-- Found CUDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so  
-- Found cuDNN: v7.6.5  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Autodetected CUDA architecture(s):  7.0 7.0 7.0 7.0
-- Added CUDA NVCC flags for: -gencode;arch=compute_70,code=sm_70
-- Found Torch: /ceph-hw/env/py-k2/lib/python3.8/site-packages/torch/lib/libtorch.so  
-- Downloading cub
-- cub is downloaded to /ceph-hw/k2/build/_deps/cub-src
-- Downloading moderngpu
-- moderngpu is downloaded to /ceph-hw/k2/build/_deps/moderngpu-src
-- Downloading googletest
-- googletest is downloaded to /ceph-hw/k2/build/_deps/googletest-src
-- googletest's binary dir is /ceph-hw/k2/build/_deps/googletest-build
-- The C compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Failed
-- LTO disabled (not supported by the compiler and/or linker)
-- Configuring done
-- Generating done
-- Build files have been written to: /ceph-hw/k2/build
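
The configure log above ends without any error, yet the link step later fails on `CUDA_cublas_LIBRARY-NOTFOUND`. When a CMake `find_library` call fails, it stores `<VAR>-NOTFOUND` in the cache, so one way to spot the problem before running `make` is to search `CMakeCache.txt`. The sketch below is only an illustration: the helper name `list_notfound` is hypothetical, and the two-line sample cache is made up for the demonstration rather than taken from this build.

```shell
# Hypothetical helper: list cache entries whose value ends in -NOTFOUND.
list_notfound() {
  grep -e '-NOTFOUND$' "$1"
}

# Demonstration on a made-up cache file (not taken from this thread's build).
sample_cache="$(mktemp)"
printf '%s\n' \
  'CUDA_cublas_LIBRARY:FILEPATH=CUDA_cublas_LIBRARY-NOTFOUND' \
  'CMAKE_BUILD_TYPE:STRING=Debug' > "$sample_cache"
missing="$(list_notfound "$sample_cache")"
echo "$missing"
```

In a real build directory this would be `list_notfound CMakeCache.txt`. Some `-NOTFOUND` entries are harmless if the variable is never used, so only those that show up in the linker command need attention.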

csukuangfj commented 3 years ago

There are probably some problems in the CUDA installation.

As a workaround, I recommend adding

-DCUDA_cublas_LIBRARY=/usr/local/cuda/lib64/libcublas.so

or

-DCUDA_cublas_LIBRARY=/usr/lib/x86_64-linux-gnu/libcublas.so

depending on where libcublas.so is located, when invoking cmake ..

The cuDNN library is at /usr/lib/x86_64-linux-gnu/libcudnn.so, so I guess it was installed via apt-get. You can try installing CUDAToolkit and cuDNN in your home directory.
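
The workaround above depends on knowing which of the two directories actually contains libcublas.so. A minimal sketch of picking the first match follows. The helper name `find_cublas` is hypothetical, and the demonstration uses temporary directories in place of the real candidate locations.

```shell
# Hypothetical helper: print the first libcublas.so found in the given directories.
find_cublas() {
  for dir in "$@"; do
    if [ -e "$dir/libcublas.so" ]; then
      printf '%s\n' "$dir/libcublas.so"
      return 0
    fi
  done
  return 1
}

# Demonstration: temporary directories stand in for the toolkit and system
# library locations; only the second one contains the library.
toolkit_dir="$(mktemp -d)"
system_dir="$(mktemp -d)"
touch "$system_dir/libcublas.so"
found="$(find_cublas "$toolkit_dir" "$system_dir")"
echo "$found"
```

With the two locations named in this comment, the call could look like `cmake -DCUDA_cublas_LIBRARY="$(find_cublas /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu)" ..`.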

qindazhu commented 3 years ago

yeah, many thanks!

csukuangfj commented 3 years ago

Is the problem fixed?

qindazhu commented 3 years ago

The CUDA installation here has issues (I did not install it myself). I may try reinstalling it and will get back to you later. Many thanks!

qindazhu commented 3 years ago

Fixed, many thanks, @csukuangfj

csukuangfj commented 3 years ago

Great!

danpovey commented 3 years ago

It seems the way we are finding CUDA does not work for newer versions of the cuBLAS library. After 10.0, e.g. with 10.1, they moved the cuBLAS library "outside" the main toolkit. Even after I installed cuda-cublas-dev-10-0, I'm getting -lCUDA_cublas_LIBRARY-NOTFOUND in the cmake output.

csukuangfj commented 3 years ago

> It seems the way we are finding CUDA does not work for newer version of the CUBLAS library. After 10.0, e.g. with 10.1,

The Colab notebook (https://colab.research.google.com/drive/1qbHUhNZUX7AYEpqnZyf29Lrz2IPHBGlX?usp=sharing) uses CUDA 10.1; however, k2 compiles there without problems.

I am using CUDA 10.1 locally, though it was not installed via apt-get.

danpovey commented 3 years ago

It may depend on how it was installed, e.g. via apt-get vs. the installer. Where is the cuBLAS library for you? With the toolkit?

csukuangfj commented 3 years ago

In the Colab notebook, it is /usr/lib/x86_64-linux-gnu/libcublas.so, while locally it is /path/to/my/cuda/lib64/libcublas.so.

csukuangfj commented 3 years ago

Sorry, I am using CUDA 10.2 locally.