csukuangfj / kaldifeat

Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API
https://csukuangfj.github.io/kaldifeat

Install error with both pip/conda (system MKL installed) #63

Closed trunglebka closed 1 year ago

trunglebka commented 1 year ago

I cannot install kaldifeat alongside k2. Steps:

  1. conda create --name k2 -c k2-fsa -c pytorch -c nvidia k2 pytorch pytorch-cuda=11.7 torchaudio => OK

  2. conda install -c kaldifeat kaldifeat => Conflicting/ERROR: kaldifeat-cuda-install-error.log

    Conda failed, so I tried installing via pip. Steps:

  3. pip install --verbose kaldifeat => Failed with a message saying that it can't find cuDNN: kaldifeat-pip-install-error.log

  4. Try to install cuDNN from conda: conda install -c conda-forge cudnn => OK (see cuDNN-conda-install-message below)

  5. Retry installing kaldifeat via pip: pip install --verbose kaldifeat => Failed with a message saying that it cannot find MKL: kaldifeat-pip-install-error-cudnn-installed.log

  6. Followed the suggestion from #10, and it failed with a CUDA problem: k2-pip-install-error-mkl-provided.log

  7. Asking myself: "I have installed, trained, and run k2, icefall, kaldifeat, and sherpa this way on a server running Ubuntu 20.04, so why can't I reproduce it on my local machine?" So I exported the conda environments of the server and my local machine... and I still cannot make it work :/ Here are the two conda environments: conda environments

  8. Tried everything for a day without success, so I came here to ask for help.

    cuDNN-conda-install-message:

    
    (k2) conda install -c conda-forge cudnn
    Collecting package metadata (current_repodata.json): done
    Solving environment: done

    ==> WARNING: A newer version of conda exists. <==
      current version: 22.9.0
      latest version: 22.11.1

    Please update conda by running

        $ conda update -n base -c conda-forge conda

    ## Package Plan ##

      environment location: /home/trungle/opt/anaconda3/envs/k2

      added / updated specs:

    The following NEW packages will be INSTALLED:

      cudatoolkit        conda-forge/linux-64::cudatoolkit-10.2.89-h713d32c_10 None
      cudnn              conda-forge/linux-64::cudnn-7.6.5.32-h01f27c4_1 None

    The following packages will be UPDATED:

      ca-certificates    pkgs/main::ca-certificates-2022.10.11~ --> conda-forge::ca-certificates-2022.12.7-ha878542_0 None
      certifi            pkgs/main/linux-64::certifi-2022.9.24~ --> conda-forge/noarch::certifi-2022.12.7-pyhd8ed1ab_0 None
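
Step 7 compares the conda environments of the server and the local machine. A minimal sketch of such a comparison, assuming both environments were dumped with `conda list --export` (one `name=version=build` line per package). The function names `parse_conda_list` and `diff_envs` and the sample package lines are made up for illustration:

```python
def parse_conda_list(text):
    """Parse `conda list --export` output into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        parts = line.split("=")
        packages[parts[0]] = parts[1] if len(parts) > 1 else ""
    return packages


def diff_envs(a, b):
    """Return {name: (version_in_a, version_in_b)} for packages that differ.

    A version of None means the package is missing from that environment.
    """
    diff = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff


if __name__ == "__main__":
    # Hypothetical sample data, not the real environments from this issue.
    server = parse_conda_list("# platform: linux-64\ncudatoolkit=11.7.0=build0\nmkl=2021.4.0=build0\n")
    local = parse_conda_list("# platform: linux-64\ncudatoolkit=10.2.89=build0\nmkl=2021.4.0=build0\n")
    for name, (left, right) in diff_envs(server, local).items():
        print(f"{name}: server={left} local={right}")
```

Packages that differ in version, or that exist in only one environment, are usually the first place to look.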

csukuangfj commented 1 year ago

conda install -c kaldifeat kaldifeat => Conflicting/ERROR: kaldifeat-cuda-install-error.log

Could you use

conda create --name k2 -c k2-fsa -c pytorch -c nvidia -c kaldifeat k2 pytorch pytorch-cuda=11.7 torchaudio kaldifeat

That is, install kaldifeat along with k2 and pytorch.


Retry installing kaldifeat via pip: pip install --verbose kaldifeat => Failed with a message saying that it cannot find MKL

  [ 50%] Linking CXX shared library ../../lib/libkaldifeat_core.so
  /usr/bin/ld: cannot find -lmkl_intel_ilp64
  /usr/bin/ld: cannot find -lmkl_core
  /usr/bin/ld: cannot find -lmkl_intel_thread

Are you able to find libmkl_intel_ilp64.so by

find $CONDA_PREFIX -name "libmkl_intel_ilp64*"

If yes, please use

export LIBRARY_PATH=$CONDA_PREFIX/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

and then re-run pip install --verbose kaldifeat.
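
The `find` and `export` commands above can also be sketched in Python, for instance to script the retry. This is only an illustration: `find_libs` and `env_with_lib_dir` are hypothetical helper names, and the environment they build is meant to be handed to a child process, since changing it inside Python does not affect the parent shell.

```python
import os
from pathlib import Path


def find_libs(root, pattern="libmkl_intel_ilp64*"):
    """Recursively list files under `root` whose name matches `pattern`."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob(pattern))


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of the environment with `lib_dir` prepended to
    LIBRARY_PATH (used at link time) and LD_LIBRARY_PATH (used at run time)."""
    env = dict(os.environ if base_env is None else base_env)
    for var in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
        old = env.get(var, "")
        env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX", "")
    hits = find_libs(prefix) if prefix else []
    print("\n".join(hits) or "libmkl_intel_ilp64 not found under CONDA_PREFIX")
```

The mapping returned by `env_with_lib_dir` can be passed as `env=` to `subprocess.run([sys.executable, "-m", "pip", "install", "--verbose", "kaldifeat"], env=...)`.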


and it failed with a CUDA problem

  -- Caffe2: Header version is: /tmp/pip-install-zy39m72l/kaldifeat_9e24c905f438489ba3826de1f701fb8e/build/temp.linux-x86_64-cpython-310/CMakeFiles/CMakeTmp/cmTC_b829c: error while loading shared libraries: libcudart.so.10.2: cannot open shared object file: No such file or directory

    error while loading shared libraries: libcudart.so.10.2: cannot open shared

Please describe how you installed cudatoolkit and pytorch. It looks to me that it is using a version of PyTorch with cuda 10.2
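
The loader error above names `libcudart.so.10.2`, and the cuDNN install log earlier in this thread shows `cudatoolkit-10.2.89` being added to the environment. A small sketch for checking which CUDA runtime an error message refers to and which CUDA version the installed PyTorch reports. `cudart_version_from_error` and `torch_cuda_version` are hypothetical helper names, while `torch.__version__` and `torch.version.cuda` are standard PyTorch attributes:

```python
import re


def cudart_version_from_error(message):
    """Extract the version suffix from a 'libcudart.so.X.Y' mention, or None."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", message)
    return match.group(1) if match else None


def torch_cuda_version():
    """Return (torch version, CUDA version torch was built with).

    Returns None when PyTorch cannot be imported; the CUDA entry is None
    for CPU-only builds.
    """
    try:
        import torch
    except (ImportError, OSError):
        return None
    return torch.__version__, torch.version.cuda


if __name__ == "__main__":
    error = ("error while loading shared libraries: libcudart.so.10.2: "
             "cannot open shared object file: No such file or directory")
    print("runtime requested by the failing binary:", cudart_version_from_error(error))
    print("installed PyTorch (version, CUDA):", torch_cuda_version())
```

If the two versions disagree, the environment likely mixes packages built against different CUDA releases.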


And I still cannot make it work :/ ... Tried everything for a day without success, so I came here to ask for help.

Sorry for the installation issues. Please ask for help as early as possible.

trunglebka commented 1 year ago

conda create --name k2 -c k2-fsa -c pytorch -c nvidia -c kaldifeat k2 pytorch pytorch-cuda=11.7 torchaudio kaldifeat

With your install command, it now installs correctly and I can configure sherpa successfully.

Regarding your remaining comments: I've provided the steps that I used. Almost all of them are conda install commands, except for kaldifeat, and MKL does exist... so the problem is really weird to me.

Anyway, thank you for your hard work on open source ASR frameworks.

trunglebka commented 1 year ago

Hi, it's me again. After creating a new conda env using your suggestion, I tried to build sherpa, and when I ran sherpa/bin/pruned_transducer_statelessX/offline_server.py, it showed an error that I think is related to kaldifeat.

Error message:

arg(): could not convert default argument 'fbank_opts: kaldifeat::FbankOptions' in method '<class '_sherpa.FeatureConfig'>.__init__' into a Python object (type not registered yet?)
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/beam_search.py", line 24, in <module>
    from sherpa import RnntConformerModel, greedy_search, modified_beam_search
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 45, in <module>
    from beam_search import GreedySearchOffline, ModifiedBeamSearchOffline

csukuangfj commented 1 year ago

Are you using the latest sherpa, i.e., the master branch of sherpa?

trunglebka commented 1 year ago

Yes, here is the check:

(k2) git status
On branch master
Your branch is up to date with 'origin/master'.

trunglebka commented 1 year ago

One additional piece of information: my computer has two versions of gcc, 12.2.0 (default) and 10.4.0 (custom installed). I have built and run Kaldi successfully on that computer, but the k2 ecosystem's mix of C++ and Python makes it really complicated for me to resolve problems myself.

trunglebka commented 1 year ago

Here is a reproducible Dockerfile. Note: it uses Fedora 36, which comes with gcc/g++ 12.2 by default.

FROM fedora:36

RUN dnf install -y wget git

WORKDIR /opt
ENV CONDA_INSTALLER=Miniconda3-py39_4.12.0-Linux-x86_64.sh
RUN wget https://repo.anaconda.com/miniconda/${CONDA_INSTALLER} && \
    bash ${CONDA_INSTALLER} -p /opt/miniconda -b && \
    rm ${CONDA_INSTALLER}
ENV PATH=/opt/miniconda/bin:${PATH}

RUN conda create --name k2 -c k2-fsa -c pytorch -c nvidia -c kaldifeat \
    k2 pytorch pytorch-cuda=11.7 torchaudio kaldifeat

RUN conda install -n k2 \
    -y -c conda-forge \
    websockets sentencepiece
RUN git clone https://github.com/k2-fsa/sherpa

ENV CMAKE_ARCHIVE=cmake-3.22.6-linux-x86_64.tar.gz
RUN wget https://cmake.org/files/v3.22/${CMAKE_ARCHIVE} && \
    tar -xzvf ${CMAKE_ARCHIVE} && \
    mv cmake-3.22.6-linux-x86_64 /opt/cmake && \
    rm ${CMAKE_ARCHIVE}
ENV PATH="/opt/cmake/bin:$PATH"

RUN dnf update -y && \
    dnf groupinstall -y "Development Tools" && \
    dnf install -y g++ gcc-c++ which

WORKDIR /opt/sherpa
ENV PATH=/opt/miniconda/envs/k2/bin:${PATH}
RUN python3 setup.py install --verbose

# may need to install gcc-12 due to linkage error
# command: `conda install -n k2 -c conda-forge gcc=12.2`

csukuangfj commented 1 year ago

By the way, how did you install sherpa?

Have you installed sherpa before? (If yes, have you uninstalled it? I suspect that the error is caused by a previous version of sherpa).
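
One way to check for a leftover or duplicate installation is to ask Python where it would load each module from and which matching distributions are installed. A minimal sketch using only the standard library. `module_origin` and `installed_copies` are hypothetical helper names, and the distribution name `k2-sherpa` is an assumption based on the `k2_sherpa-1.1-...egg` path that appears in the tracebacks in this thread:

```python
import importlib.util
from importlib import metadata


def module_origin(name):
    """Return the file a top-level module would be loaded from, without
    importing it, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec else None


def installed_copies(dist_name):
    """Return [(version, location)] for every installed distribution whose
    name matches `dist_name` (case-insensitive, '_' treated as '-')."""
    wanted = dist_name.lower().replace("_", "-")
    found = []
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            continue  # skip distributions with unreadable metadata
        if name and name.lower().replace("_", "-") == wanted:
            found.append((dist.version, str(dist.locate_file(""))))
    return found


if __name__ == "__main__":
    for mod in ("kaldifeat", "sherpa", "_sherpa"):
        print(mod, "->", module_origin(mod))
    for dist_name in ("kaldifeat", "k2-sherpa"):  # "k2-sherpa" is an assumed name
        print(dist_name, installed_copies(dist_name))
```

More than one entry for the same distribution, or a module path outside the active conda env, would suggest that an older build is being picked up.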

trunglebka commented 1 year ago

I've provided a Dockerfile that shows the same problem I encountered, even though it is a clean install. On my local machine, I just created the k2 env as you suggested and installed sherpa after that. Everything builds without problems, but at run time it shows the error I posted above. Removing and recreating the env did not help.

csukuangfj commented 1 year ago

Could you run

$ sherpa-version

in your terminal and post the output?

csukuangfj commented 1 year ago

Could you step into sherpa/__init__.py and check at which line it fails?

I could not reproduce it locally, however.

trunglebka commented 1 year ago

Here is the sherpa-version output, run from the build dir (I cannot find it in PATH):

❯ build/lib.linux-x86_64-cpython-310/sherpa/bin/sherpa-version
sherpa version: 1.1
build type: Debug
OS used to build sherpa: 
sherpa git sha1: 21cd0e922876b4d4fba491d6f760fbc4d1a24781
sherpa git date: Mon Dec 12 20:03:54 2022
sherpa git branch: master
PyTorch version used to build sherpa: 1.13.0
CUDA version: 
cuDNN version: 
k2 version used to build sherpa: 1.23.2
k2 git sha1: 1feafa064cf3b6c243e6b33b0192601224210937
k2 git date: Fri Nov 25 08:23:51 2022
k2 with cuda: OFF
kaldifeat version used to build sherpa: 1.22
cmake version: 3.22.2
compiler ID: GNU
compiler: /usr/bin/c++
compiler version: 12.2.1
cmake CXX flags:    -Wall  -g  -D_GLIBCXX_USE_CXX11_ABI=0   -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-unused-variable  -Wno-strict-overflow   -D_GLIBCXX_USE_CXX11_ABI=0
Python version: 3.10

Here is the error message

Exception has occurred: ImportError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
arg(): could not convert default argument 'fbank_opts: kaldifeat::FbankOptions' in method '<class '_sherpa.FeatureConfig'>.__init__' into a Python object (type not registered yet?)
  File "/home/trungle/opt/anaconda3/envs/k2/lib/python3.10/site-packages/k2_sherpa-1.1-py3.10-linux-x86_64.egg/sherpa/__init__.py", line 12, in <module>
    from _sherpa import (
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/beam_search.py", line 24, in <module>
    from sherpa import RnntConformerModel, greedy_search, modified_beam_search
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 45, in <module>
    from beam_search import GreedySearchOffline, ModifiedBeamSearchOffline
  File "/home/trungle/opt/anaconda3/envs/k2/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/trungle/opt/anaconda3/envs/k2/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

csukuangfj commented 1 year ago

Are you able to run


(py38) kuangfangjun:~$ python3
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
>>>
(py38) kuangfangjun:~$ python3
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import kaldifeat
>>> import _sherpa
>>>

trunglebka commented 1 year ago

My local machine (the failing one):

➜ conda activate k2
(k2) python3
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/trungle/opt/anaconda3/envs/k2/lib/python3.10/site-packages/k2_sherpa-1.1-py3.10-linux-x86_64.egg/sherpa/__init__.py", line 12, in <module>
    from _sherpa import (
ImportError: arg(): could not convert default argument 'fbank_opts: kaldifeat::FbankOptions' in method '<class '_sherpa.FeatureConfig'>.__init__' into a Python object (type not registered yet?)
>>> 
>>> 
(k2) python3
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import kaldifeat
>>> import _sherpa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: arg(): could not convert default argument 'fbank_opts: kaldifeat::FbankOptions' in method '<class '_sherpa.FeatureConfig'>.__init__' into a Python object (type not registered yet?)
>>> 

My server (the successful one):

(k2) ➜  ~ python
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
>>> 
(k2) ➜  ~ import kaldifeat
(k2) ➜  ~ python
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import kaldifeat
>>> import _sherpa
>>> 

csukuangfj commented 1 year ago

That is strange.

Did you use the same method to install python on your server and on your local machine?

trunglebka commented 1 year ago

I don't remember exactly; I'm just going by the shell history (~/.zsh_history). I tried to reproduce it on my local machine, but could not install everything successfully, as in the original problem that I reported.

csukuangfj commented 1 year ago

Could you try to also install kaldifeat from source, like what you are doing for sherpa?

trunglebka commented 1 year ago

I've tried Docker with Rocky Linux 8 (gcc 8.5.0), Fedora 35 (gcc 11.3.1), and Ubuntu 20.04 (gcc 9.4.0), all with the same error:

root@49f780f91fe3:/opt/sherpa# /home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/offline_server.py --port 8888 --nn-model-filename /home/chicky/workspace/github/icefall/egs/gigaspeech/ASR/pruned_transducer_stateless2/jit-icefall-asr-gigaspeech-pruned-transducer-stateless2/exp/cpu_jit-epoch-29-avg-8-torch-1.10.0.pt --bpe-model-filename /home/chicky/workspace/github/icefall/egs/gigaspeech/ASR/data/lang_bpe_500.BAK/bpe.model --num-device 0
Traceback (most recent call last):
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 45, in <module>
    from beam_search import GreedySearchOffline, ModifiedBeamSearchOffline
  File "/home/chicky/workspace/github/sherpa/sherpa/bin/pruned_transducer_statelessX/beam_search.py", line 24, in <module>
    from sherpa import RnntConformerModel, greedy_search, modified_beam_search
  File "/opt/miniconda/envs/k2/lib/python3.10/site-packages/k2_sherpa-1.1-py3.10-linux-x86_64.egg/sherpa/__init__.py", line 12, in <module>
    from _sherpa import (
ImportError: arg(): could not convert default argument into a Python object (type not registered yet?). Compile in debug mode for more information.
root@49f780f91fe3:/opt/sherpa# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS"
root@49f780f91fe3:/opt/sherpa# gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The Dockerfile using ubuntu:

FROM ubuntu:20.04

RUN apt update && \
    apt install -y build-essential wget git

WORKDIR /opt
ENV CONDA_INSTALLER=Miniconda3-py39_4.12.0-Linux-x86_64.sh
RUN wget https://repo.anaconda.com/miniconda/${CONDA_INSTALLER} && \
    bash ${CONDA_INSTALLER} -p /opt/miniconda -b && \
    rm ${CONDA_INSTALLER}
ENV PATH=/opt/miniconda/bin:${PATH}

RUN conda create --name k2 -c k2-fsa -c pytorch -c nvidia -c kaldifeat \
    k2 pytorch pytorch-cuda=11.7 torchaudio kaldifeat

RUN conda install -n k2 \
    -y -c conda-forge \
    websockets sentencepiece
RUN git clone https://github.com/k2-fsa/sherpa

ENV CMAKE_ARCHIVE=cmake-3.22.6-linux-x86_64.tar.gz
RUN wget https://cmake.org/files/v3.22/${CMAKE_ARCHIVE} && \
    tar -xzvf ${CMAKE_ARCHIVE} && \
    mv cmake-3.22.6-linux-x86_64 /opt/cmake && \
    rm ${CMAKE_ARCHIVE}
ENV PATH="/opt/cmake/bin:$PATH"

WORKDIR /opt/sherpa
ENV PATH=/opt/miniconda/envs/k2/bin:${PATH}
RUN python3 setup.py install --verbose

# may need to install gcc-12 due to linkage error
# command: `conda install -n k2 -c conda-forge gcc=12.2`

It is very weird that Ubuntu 20.04 failed, since I'm using sherpa on an Ubuntu 20.04 server (but with a different installation procedure). I will keep trying.

trunglebka commented 1 year ago

Update 1: After creating a new env with your suggested command on my server, where another env runs sherpa successfully, the same problem occurred.

trunglebka commented 1 year ago

Update 2: the following steps work on my server:

  1. conda create --name k3 -c k2-fsa -c pytorch -c nvidia -c kaldifeat k2 pytorch pytorch-cuda=11.7 torchaudio
  2. conda activate k3
  3. pip install kaldifeat
  4. Step 3 complained about cuDNN, so, going by my shell history, I ran: conda install -c "nvidia/label/cuda-11.7.1" cuda-toolkit, then conda install -c conda-forge cudnn
  5. pip install kaldifeat
  6. Install sherpa from source: python3 setup.py install --verbose

Sample:

(k3) ➜  sherpa git:(master) python
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
>>> import kaldifeat
>>> import _sherpa
>>>

trunglebka commented 1 year ago

Update 3: if I run the previous steps in Docker, the build complains about CUDA:

  CUDA_TOOLKIT_ROOT_DIR not found or specified
  -- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
  CMake Warning at /opt/miniconda/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
    Caffe2: CUDA cannot be found.  Depending on whether you are building Caffe2
    or a Caffe2 dependent library, the next warning / error will give you more
    info.
  Call Stack (most recent call first):
    /opt/miniconda/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:92 (include)
    /opt/miniconda/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
    cmake/torch.cmake:14 (find_package)
    CMakeLists.txt:56 (include)

  CMake Error at /opt/miniconda/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:94 (message):
    Your installed Caffe2 version uses CUDA but I cannot find the CUDA
    libraries.  Please set the proper CUDA prefixes and / or install CUDA.
  Call Stack (most recent call first):
    /opt/miniconda/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
    cmake/torch.cmake:14 (find_package)
    CMakeLists.txt:56 (include)

Here is the content of Dockerfile that I used:

FROM ubuntu:20.04

RUN apt update && \
    apt install -y build-essential wget git

WORKDIR /opt
ENV CONDA_INSTALLER=Miniconda3-py39_4.12.0-Linux-x86_64.sh
RUN wget https://repo.anaconda.com/miniconda/${CONDA_INSTALLER} && \
    bash ${CONDA_INSTALLER} -p /opt/miniconda -b && \
    rm ${CONDA_INSTALLER}
ENV PATH=/opt/miniconda/bin:${PATH}

RUN conda create --name k2 -c k2-fsa -c pytorch -c nvidia \
    k2 pytorch pytorch-cuda=11.7 torchaudio

ENV CMAKE_ARCHIVE=cmake-3.22.6-linux-x86_64.tar.gz
RUN wget https://cmake.org/files/v3.22/${CMAKE_ARCHIVE} && \
    tar -xzvf ${CMAKE_ARCHIVE} && \
    mv cmake-3.22.6-linux-x86_64 /opt/cmake && \
    rm ${CMAKE_ARCHIVE}
ENV PATH="/opt/cmake/bin:$PATH"

RUN conda install -n k2 -y  -c conda-forge cudnn && \
    conda install -n k2 -y  -c "nvidia/label/cuda-11.7.1" cuda-toolkit 

RUN conda install -n k2 -y -c conda-forge cuda=11.7

RUN /opt/miniconda/envs/k2/bin/pip install --verbose kaldifeat

RUN conda install -n k2 \
    -y -c conda-forge \
    websockets sentencepiece
RUN git clone https://github.com/k2-fsa/sherpa

WORKDIR /opt/sherpa
ENV PATH=/opt/miniconda/envs/k2/bin:${PATH}
RUN python3 setup.py install --verbose

# may need to install gcc-12 due to linkage error
# command: `conda install -n k2 -c conda-forge gcc=12.2`

trunglebka commented 1 year ago

Using the procedure in https://github.com/csukuangfj/kaldifeat/issues/63#issuecomment-1353394777 (the conda install -c "nvidia/label/cuda-11.7.1" cuda-toolkit step turned out to be unnecessary), I have now installed sherpa and can load it successfully.

If I build sherpa with gcc-10 (given priority via PATH), it runs without any further steps. If I build sherpa with gcc-12, I need to install gcc 12 into the conda env.

The steps are basically the same as what I posted in the original issue. I'm not sure what I changed on my computer/conda, but now I cannot even reproduce the issue @@~.

-> Update: My local computer has oneAPI/MKL installed, but not in the library search path. If I move /opt/intel to /opt/intel.BAK, I can install successfully as described in this comment; otherwise, with oneAPI installed from the Intel tarball, the install process fails as reported in the original issue. My server also has a system MKL installed, but it has a library path entry for it (it seems I installed it using the Kaldi tools).

(k3) python  
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/trungle/opt/anaconda3/envs/k3/lib/python3.10/site-packages/k2_sherpa-1.1-py3.10-linux-x86_64.egg/sherpa/__init__.py", line 12, in <module>
    from _sherpa import (
ImportError: /home/trungle/opt/anaconda3/envs/k3/lib/python3.10/site-packages/torch/lib/../../../.././libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /home/trungle/opt/anaconda3/envs/k3/lib/python3.10/site-packages/k2_sherpa-1.1-py3.10-linux-x86_64.egg/sherpa/lib/libsherpa_kaldi_native_io_core.so)
>>> 
(k3) conda install -c conda-forge gcc=12.2
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 22.9.0
  latest version: 22.11.1

Please update conda by running

    $ conda update -n base -c conda-forge conda

## Package Plan ##

  environment location: /home/trungle/opt/anaconda3/envs/k3

  added / updated specs:
    - gcc=12.2

The following NEW packages will be INSTALLED:

  binutils_impl_lin~ conda-forge/linux-64::binutils_impl_linux-64-2.39-he00db2b_1 None
  gcc                conda-forge/linux-64::gcc-12.2.0-h26027b1_11 None
  gcc_impl_linux-64  conda-forge/linux-64::gcc_impl_linux-64-12.2.0-hcc96c02_19 None
  kernel-headers_li~ conda-forge/noarch::kernel-headers_linux-64-2.6.32-he073ed8_15 None
  libgcc-devel_linu~ conda-forge/linux-64::libgcc-devel_linux-64-12.2.0-h3b97bd3_19 None
  libsanitizer       conda-forge/linux-64::libsanitizer-12.2.0-h46fd767_19 None
  sysroot_linux-64   conda-forge/noarch::sysroot_linux-64-2.12-he073ed8_15 None

The following packages will be UPDATED:

  ld_impl_linux-64   pkgs/main::ld_impl_linux-64-2.38-h118~ --> conda-forge::ld_impl_linux-64-2.39-hcc3a1bd_1 None
  libgcc-ng          pkgs/main::libgcc-ng-11.2.0-h1234567_1 --> conda-forge::libgcc-ng-12.2.0-h65d4601_19 None
  libgomp              pkgs/main::libgomp-11.2.0-h1234567_1 --> conda-forge::libgomp-12.2.0-h65d4601_19 None
  libstdcxx-ng       pkgs/main::libstdcxx-ng-11.2.0-h12345~ --> conda-forge::libstdcxx-ng-12.2.0-h46fd767_19 None
  openssl              pkgs/main::openssl-1.1.1s-h7f8727e_0 --> conda-forge::openssl-1.1.1s-h0b41bf4_1 None

The following packages will be SUPERSEDED by a higher-priority channel:

  _libgcc_mutex           pkgs/main::_libgcc_mutex-0.1-main --> conda-forge::_libgcc_mutex-0.1-conda_forge None
  _openmp_mutex          pkgs/main::_openmp_mutex-5.1-1_gnu --> conda-forge::_openmp_mutex-4.5-2_gnu None

Proceed ([y]/n)? 

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Retrieving notices: ...working... done
(k3) python                               
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sherpa
>>> import kaldifeat
>>> import _sherpa
>>> 
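
The `GLIBCXX_3.4.30' not found` error above means that the `libstdc++.so.6` picked up from the conda environment does not provide that symbol version; in the log, the import succeeds after `libstdcxx-ng` is updated from 11.2.0 to 12.2.0. A rough sketch for listing the `GLIBCXX_*` versions a given library file contains, similar in spirit to `strings libstdc++.so.6 | grep GLIBCXX`. The helper names are made up, and scanning raw bytes is a heuristic rather than a proper ELF parse:

```python
import re
import sys


def glibcxx_versions(path):
    """Return the GLIBCXX_x.y.z version strings found in the file at `path`,
    sorted numerically."""
    with open(path, "rb") as f:
        data = f.read()
    found = {m.decode("ascii") for m in re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data)}
    return sorted(found, key=lambda v: tuple(int(x) for x in v.split(".")))


def provides_glibcxx(path, version):
    """Return True if the library at `path` mentions GLIBCXX_<version>."""
    return version in glibcxx_versions(path)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: python check_glibcxx.py /path/to/libstdc++.so.6 [version]")
    lib = sys.argv[1]
    wanted = sys.argv[2] if len(sys.argv) > 2 else "3.4.30"
    print("newest versions:", glibcxx_versions(lib)[-3:])
    print(f"GLIBCXX_{wanted} present:", provides_glibcxx(lib, wanted))
```

Running it against `$CONDA_PREFIX/lib/libstdc++.so.6` before and after the gcc install shows whether the required version became available.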

csukuangfj commented 1 year ago

Thanks for your effort in reproducing.

I am not sure what causes the problem. Sorry, I don't know how to fix it.

It looks to me that if you use pip install kaldifeat (which installs kaldifeat from source) instead of conda install (which uses a pre-compiled kaldifeat), it runs fine, right?

trunglebka commented 1 year ago

I've updated my last comment. It seems there is a problem with linking against a system MKL that is not in the library search path.

csukuangfj commented 1 year ago

The steps are basically the same as what I posted in the original issue. I'm not sure what I changed on my computer/conda, but now I cannot even reproduce the issue @@~

-> Update: My local computer has oneAPI/MKL installed, but not in the library search path. If I move /opt/intel to /opt/intel.BAK, I can install successfully as described in this comment; otherwise, with oneAPI installed from the Intel tarball, the install process fails as reported in the original issue. My server also has a system MKL installed, but it has a library path entry for it (it seems I installed it using the Kaldi tools).

Does it mean when you switch from GCC 11.2.0 to 12.2.0, it works without reinstalling sherpa and kaldifeat?

trunglebka commented 1 year ago

No, the GCC version is just additional information. The problem is linking kaldifeat_core with a system MKL (/opt/intel...) that is not in the library search path.

trunglebka commented 1 year ago

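I think I found the cause:

  • First, torch/Caffe2 links to MKL "passively": /opt/anaconda3/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Caffe2/public/mkl.cmake. This link is resolved at runtime against the conda envs/k2/lib directory.
  • But if MKL is installed on the computer at /opt/intel, CMake resolves it there during the build, and my computer does not have a library search path entry for that directory, so it fails.
  • If I set set(CMAKE_DISABLE_FIND_PACKAGE_MKL TRUE) before including torch.cmake (in both kaldifeat and sherpa), both can be installed.

So the problem seems to be solved. It happens mainly because my MKL installation does not follow the "standard" layout that CMake and PyTorch expect (even though it was installed from the official Intel tarball).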
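
The situation diagnosed in this thread (an MKL installation under /opt/intel that CMake finds at build time but that is not on the library search path) can be checked with a short script. This is a sketch under assumptions: the helper names are hypothetical, only `LIBRARY_PATH` and `LD_LIBRARY_PATH` are inspected, and directories registered through `/etc/ld.so.conf` are not considered:

```python
import os
from pathlib import Path


def mkl_lib_dirs(root):
    """Return the directories under `root` that contain a libmkl_core* file."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted({str(p.parent) for p in root.rglob("libmkl_core*")})


def dirs_not_on_search_path(dirs, env=None):
    """Return the entries of `dirs` that appear in neither LIBRARY_PATH nor
    LD_LIBRARY_PATH of `env` (default: the current environment)."""
    env = os.environ if env is None else env
    listed = set()
    for var in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
        for entry in env.get(var, "").split(os.pathsep):
            if entry:
                listed.add(os.path.normpath(entry))
    return [d for d in dirs if os.path.normpath(d) not in listed]


if __name__ == "__main__":
    found = mkl_lib_dirs("/opt/intel")
    for d in dirs_not_on_search_path(found):
        print("MKL present but not on the search path:", d)
    if not found:
        print("no MKL found under /opt/intel")
```

Any directory it prints is a candidate for the mismatch reported in this thread: either add it to the search path or keep CMake from picking up that MKL.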

csukuangfj commented 1 year ago

I think I found the cause:

  • First, torch/Caffe2 links to MKL "passively": /opt/anaconda3/envs/k2/lib/python3.10/site-packages/torch/share/cmake/Caffe2/public/mkl.cmake. This link is resolved at runtime against the conda envs/k2/lib directory.
  • But if MKL is installed on the computer at /opt/intel, CMake resolves it there during the build, and my computer does not have a library search path entry for that directory, so it fails.
  • If I set set(CMAKE_DISABLE_FIND_PACKAGE_MKL TRUE) before including torch.cmake (in both kaldifeat and sherpa), both can be installed.

So the problem seems to be solved. It happens mainly because my MKL installation does not follow the "standard" layout that CMake and PyTorch expect (even though it was installed from the official Intel tarball).

Great to hear that you found the reason.

I am surprised that link issues with MKL lead to such an error when importing sherpa.

Would you mind creating a pull request (in both kaldifeat and sherpa) to fix it?

Thanks !