NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT
Apache License 2.0

undefined reference to `mkl_graph_mxv_plus_times_i32_nomatval_def_i32_i32_bl' #593

Open Zhang-kg opened 1 year ago

Zhang-kg commented 1 year ago

Branch/Tag/Commit

main

Docker Image Version

none

GPU name

A100

CUDA Driver

520.61.05

Reproduced Steps

I created an environment with conda. Running "conda list" in that environment shows the following packages:

# packages in environment at /home/zkg/anaconda3/envs/fasterTransformer:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
blas                      1.0                         mkl  
brotlipy                  0.7.0           py310h7f8727e_1002  
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2023.01.10           h06a4308_0  
certifi                   2022.12.7       py310h06a4308_0  
cffi                      1.15.1          py310h5eee18b_3  
charset-normalizer        2.0.4              pyhd3eb1b0_0  
cryptography              39.0.1          py310h9ce1e76_0  
cudatoolkit               11.3.1               h2bc3f7f_2  
ffmpeg                    4.3                  hf484d3e_0    pytorch
freetype                  2.12.1               h4a9f257_0  
giflib                    5.2.1                h5eee18b_3  
gmp                       6.2.1                h295c915_3  
gnutls                    3.6.15               he1e5248_0  
idna                      3.4             py310h06a4308_0  
intel-openmp              2021.4.0          h06a4308_3561  
jpeg                      9e                   h5eee18b_1  
lame                      3.100                h7b6447c_0  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.38                 h1181459_1  
lerc                      3.0                  h295c915_0  
libdeflate                1.17                 h5eee18b_0  
libffi                    3.4.2                h6a678d5_6  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libiconv                  1.16                 h7f8727e_2  
libidn2                   2.3.2                h7f8727e_0  
libpng                    1.6.39               h5eee18b_0  
libstdcxx-ng              11.2.0               h1234567_1  
libtasn1                  4.19.0               h5eee18b_0  
libtiff                   4.5.0                h6a678d5_2  
libunistring              0.9.10               h27cfd23_0  
libuuid                   1.41.5               h5eee18b_0  
libwebp                   1.2.4                h11a3e52_1  
libwebp-base              1.2.4                h5eee18b_1  
lz4-c                     1.9.4                h6a678d5_0  
mkl                       2021.4.0           h06a4308_640  
mkl-service               2.4.0           py310h7f8727e_0  
mkl_fft                   1.3.1           py310hd6ae3a3_0  
mkl_random                1.2.2           py310h00e6091_0  
ncurses                   6.4                  h6a678d5_0  
nettle                    3.7.3                hbbd107a_1  
numpy                     1.23.5                   pypi_0    pypi
openh264                  2.1.1                h4ff587b_0  
openssl                   1.1.1t               h7f8727e_0  
pillow                    9.4.0           py310h6a678d5_0  
pip                       23.0.1          py310h06a4308_0  
pycparser                 2.21               pyhd3eb1b0_0  
pyopenssl                 23.0.0          py310h06a4308_0  
pysocks                   1.7.1           py310h06a4308_0  
python                    3.10.11              h7a1cb2a_2  
pytorch                   1.12.1          py3.10_cuda11.3_cudnn8.3.2_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
readline                  8.2                  h5eee18b_0  
requests                  2.29.0          py310h06a4308_0  
setuptools                66.0.0          py310h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.41.2               h5eee18b_0  
tbb                       2021.8.0             hdb19cb5_0  
tk                        8.6.12               h1ccaba5_0  
torchaudio                0.12.1              py310_cu113    pytorch
torchvision               0.13.1              py310_cu113    pytorch
typing_extensions         4.5.0           py310h06a4308_0  
tzdata                    2023c                h04d1e81_0  
urllib3                   1.26.15         py310h06a4308_0  
wheel                     0.38.4          py310h06a4308_0  
xz                        5.4.2                h5eee18b_0  
zlib                      1.2.13               h5eee18b_0  
zstd                      1.5.5                hc292b87_0
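
The mkl 2021.4.0 and intel-openmp 2021.4.0 entries above are the packages most relevant to the linker errors reported further down. As a rough diagnostic sketch, the following lists which MKL shared libraries are actually present in an environment's lib directory. It assumes the conda mkl package places its libmkl_*.so files under <env>/lib, and the helper name list_mkl_libs is hypothetical, not part of FasterTransformer or conda.

```python
from pathlib import Path


def list_mkl_libs(env_prefix):
    """Return the sorted file names of MKL shared libraries under <env_prefix>/lib.

    Assumption: MKL libraries live directly in <env_prefix>/lib and are
    named libmkl_*.so (optionally with a version suffix such as .so.1).
    """
    lib_dir = Path(env_prefix) / "lib"
    return sorted(p.name for p in lib_dir.glob("libmkl_*.so*"))


if __name__ == "__main__":
    import sys

    # sys.prefix points at the active environment when run with its Python.
    for name in list_mkl_libs(sys.prefix):
        print(name)
```

Run with the environment's own Python, this prints one library name per line. If a library named in the error log appears here while one it depends on does not, that may be worth investigating.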

In the "FasterTransformer/build" folder, I ran the following commands. The build eventually fails with linker errors.

cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON -DBUILD_MULTI_GPU=ON ..
make -j

[ 72%] Built target th_gather_tree
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxv_plus_times_i32_def_i32_i32_fp64'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_intel_lp64.so: undefined reference to `mkl_lapack_dcombssq'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxm_dot_aliased_phase2_plus_times_i32_def_i32_i64_bl'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxv_plus_times_fp32_def_i64_i32_fp32'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxv_plus_times_i32_nomatval_def_i32_i64_fp64'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxm_dot_aliased_phase2_plus_times_i64_nomatval_nomaskval_def_i32_i64_bl'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxm_dot_fallback_phase2_plus_times_i32_def_i64_i32_i64'
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxm_gus_phase2_any_pair_bl_def_i32_i32_i64'
...
/usr/bin/ld: /home/zkg/anaconda3/envs/fasterTransformer/lib/python3.10/site-packages/torch/lib/../../../../libmkl_gnu_thread.so: undefined reference to `mkl_graph_mxm_dot_aliased_phase2_plus_times_i64_nomatval_nomaskval_def_i32_i32_bl'
collect2: error: ld returned 1 exit status
make[2]: *** [tests/int8_gemm/CMakeFiles/int8_gemm_test.dir/build.make:158: bin/int8_gemm_test] Error 1
make[1]: *** [CMakeFiles/Makefile2:10912: tests/int8_gemm/CMakeFiles/int8_gemm_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
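
Because the log above is long and repetitive, it can help to summarize which libraries report undefined references and which symbol families are missing. The following is a minimal sketch that assumes every relevant line has the form "/usr/bin/ld: <library>: undefined reference to `<symbol>'" as shown above. The function name summarize_undefined is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   /usr/bin/ld: /path/to/libfoo.so: undefined reference to `some_symbol'
UNDEF_RE = re.compile(r"ld: (?P<lib>\S+): undefined reference to `(?P<sym>[^']+)'")


def summarize_undefined(log_text):
    """Count undefined references per library and per symbol prefix.

    The prefix is the first two underscore-separated parts of the symbol,
    e.g. "mkl_graph" for "mkl_graph_mxv_plus_times_...".
    """
    libs = Counter()
    prefixes = Counter()
    for line in log_text.splitlines():
        match = UNDEF_RE.search(line)
        if match is None:
            continue  # skip make/collect2 lines and anything else
        lib_name = match.group("lib").rsplit("/", 1)[-1]
        libs[lib_name] += 1
        prefix = "_".join(match.group("sym").split("_")[:2])
        prefixes[prefix] += 1
    return libs, prefixes


if __name__ == "__main__":
    # Two lines shaped like the log above, with shortened paths.
    sample = (
        "/usr/bin/ld: /env/lib/libmkl_gnu_thread.so: undefined reference to "
        "`mkl_graph_mxv_plus_times_i32_def_i32_i32_fp64'\n"
        "/usr/bin/ld: /env/lib/libmkl_intel_lp64.so: undefined reference to "
        "`mkl_lapack_dcombssq'\n"
    )
    libs, prefixes = summarize_undefined(sample)
    print(dict(libs))
    print(dict(prefixes))
```

Applied to the full build output (for example, saved with `make -j 2>&1 | tee build.log`), the per-library counts show whether the missing symbols come only from the MKL libraries or from other libraries as well.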

Cydia2018 commented 1 year ago

Have you solved this problem?