broncotc / bitsandbytes-rocm

MIT License

no matching member function for call to 'BlockedToStriped' #7

Open banianzr opened 1 year ago

banianzr commented 1 year ago

Hi, I'm using a Hygon DCU Z100 (an AMD-like GPU) on CentOS 7.6 with ROCm 4.0.1. I tried to get this to work for the Stanford Alpaca extension for 8-bit Adam, but I ran into a build problem:

make CUDA_VERSION=gfx1030 hip results in the following:

which: no nvcc in (/public/software/compiler/dtk-22.10.1/bin:/public/software/compiler/dtk-22.10.1/llvm/bin:/public/software/compiler/dtk-22.10.1/hip/bin:/public/software/compiler/dtk-22.10.1/hip/bin/hipify:/public/software/compiler/dtk-22.10.1/miopen/bin:/opt/hpc/software/mpi/hpcx/v2.11.0/gcc-7.3.1/bin:/opt/hpc/software/mpi/hpcx/v2.11.0/hcoll/bin:/opt/hpc/software/mpi/hpcx/v2.11.0/ucx_without_rocm/bin:/opt/rh/devtoolset-7/root/usr/bin:/opt/hpc/setfreq:/opt/gridview/slurm/bin:/opt/gridview/slurm/sbin:/opt/gridview/munge/bin:/opt/gridview/munge/sbin:/opt/clusconf/sbin:/opt/clusconf/bin:/work/home/ac842t8oeo/miniconda3/envs/alpaca/bin:/work/home/ac842t8oeo/miniconda3/condabin:/opt/hpc/setfreq:/usr/lib64/qt-3.3/bin:/work/home/ac842t8oeo/perl5/bin:/opt/gridview/slurm/bin:/opt/gridview/slurm/sbin:/opt/gridview/munge/bin:/opt/gridview/munge/sbin:/opt/clusconf/sbin:/opt/clusconf/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/work/home/ac842t8oeo/.local/bin:/work/home/ac842t8oeo/bin)
# /usr/bin/hipcc -std=c++14 -c -fPIC --amdgpu-target=gfx1030 -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/include -I /public/software/compiler/dtk-22.10.1/include -I/public/software/compiler/dtk-22.10.1/llvm/include -I/public/software/compiler/dtk-22.10.1/hip/include -I/public/software/compiler/dtk-22.10.1/miopen/include -I /opt/rocm/hipcub/include -o /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/build/ops.o -D NO_CUBLASLT /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/ops.cu
# /usr/bin/hipcc -std=c++14 -c -fPIC --amdgpu-target=gfx1030 -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/include -I /public/software/compiler/dtk-22.10.1/include -I/public/software/compiler/dtk-22.10.1/llvm/include -I/public/software/compiler/dtk-22.10.1/hip/include -I/public/software/compiler/dtk-22.10.1/miopen/include -I /opt/rocm/hipcub/include -o /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/build/kernels.o -D NO_CUBLASLT /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu
/public/software/compiler/dtk-22.10.1/bin/hipcc -std=c++14 -c -fPIC --offload-arch=gfx1030 -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/include -I /public/software/compiler/dtk-22.10.1/include -I/public/software/compiler/dtk-22.10.1/llvm/include -I/public/software/compiler/dtk-22.10.1/hip/include -I/public/software/compiler/dtk-22.10.1/miopen/include -I /opt/rocm/hipcub/include -o /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/build/ops.o -D NO_CUBLASLT /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/ops.cu
/public/software/compiler/dtk-22.10.1/bin/hipcc -std=c++14 -c -fPIC --offload-arch=gfx1030 -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc -I /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/include -I /public/software/compiler/dtk-22.10.1/include -I/public/software/compiler/dtk-22.10.1/llvm/include -I/public/software/compiler/dtk-22.10.1/hip/include -I/public/software/compiler/dtk-22.10.1/miopen/include -I /opt/rocm/hipcub/include -o /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/build/kernels.o -D NO_CUBLASLT /work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu

/work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu:1874:40: error: no matching member function for call to 'BlockedToStriped'  
    BlockExchange(temp_storage.exchange).BlockedToStriped(local_col_absmax_values);  
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
/work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu:1899:26: note: in instantiation of function template specialization 'kgetColRowStats<__half, 64, 4, 16, 256, 0>' requested here
template __global__ void kgetColRowStats<half, 64, 4, 16, 64*4, 0>(half * __restrict__ A, float *rowStats, float *colStats, int * nnz_count_row, float nnz_threshold, int rows, int cols, int tiledRows, int tiledCols);                         
/opt/rocm/hipcub/include/hipcub/rocprim/block/block_exchange.hpp:91:10: note: candidate function template not viable: requires 2 arguments, but 1 was provided    
    void BlockedToStriped(InputT (&input_items)[ITEMS_PER_THREAD],         
/work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu:1874:40: error: no matching member function for call to 'BlockedToStriped'  
    BlockExchange(temp_storage.exchange).BlockedToStriped(local_col_absmax_values);  
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
/work/home/ac842t8oeo/bnb-zhr/bitsandbytes-rocm/csrc/kernels.cu:1900:26: note: in instantiation of function template specialization 'kgetColRowStats<__half, 64, 4, 16, 256, 1>' requested here
template __global__ void kgetColRowStats<half, 64, 4, 16, 64*4, 1>(half * __restrict__ A, float *rowStats, float *colStats, int * nnz_count_row, float nnz_threshold, int rows, int cols, int tiledRows, int tiledCols);
/opt/rocm/hipcub/include/hipcub/rocprim/block/block_exchange.hpp:91:10: note: candidate function template not viable: requires 2 arguments, but 1 was provided    
    void BlockedToStriped(InputT (&input_items)[ITEMS_PER_THREAD],         
2 errors generated when compiling for gfx1030.
make: *** [Makefile:113: hip] Error 1
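
Reading the compiler note above: the hipcub header picked up from /opt/rocm/hipcub appears to declare only a two-argument `BlockedToStriped` (an input array and, presumably, an output array), while kernels.cu calls the one-argument in-place form. One possible workaround, not verified on the DCU, would be to pass the same array as both arguments. The sketch below is a host-only C++ mock of that arity mismatch, not hipcub itself: `MockBlockExchange` and `blocked_to_striped_in_place` are hypothetical names, and the mock only copies items rather than exchanging them across threads.

```cpp
// Hypothetical host-only stand-in for a BlockExchange that, like the header
// in the log, offers only the two-argument BlockedToStriped overload.
// It copies input to output and does NOT model a real cross-thread exchange.
template <typename InputT, int ITEMS_PER_THREAD>
struct MockBlockExchange {
    int calls = 0;  // counts invocations so the forwarding can be checked

    template <typename OutputT>
    void BlockedToStriped(InputT (&input_items)[ITEMS_PER_THREAD],
                          OutputT (&output_items)[ITEMS_PER_THREAD]) {
        for (int i = 0; i < ITEMS_PER_THREAD; ++i) {
            output_items[i] = static_cast<OutputT>(input_items[i]);
        }
        ++calls;
    }
};

// Hypothetical helper: emulates the one-argument (in-place) call by
// forwarding the same array as both input and output.
template <typename Exchange, typename T, int N>
void blocked_to_striped_in_place(Exchange& exchange, T (&items)[N]) {
    exchange.BlockedToStriped(items, items);
}
```

Applied to the failing line, the same idea would read `BlockExchange(temp_storage.exchange).BlockedToStriped(local_col_absmax_values, local_col_absmax_values);`. Whether the in-place call is safe with this particular hipcub/rocPRIM version is an assumption that should be checked against its block_exchange.hpp.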

Any assistance in this matter would be much appreciated, and thanks for your time!