mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License

Fix Matrix3D int type error for Windows #81

Closed: xieqihui closed this pull request 7 months ago

xieqihui commented 10 months ago

Change the type of position_ids_buf from float to int so that it matches Matrix3D<int> and the project builds on Windows.

This PR fixes the following error when building on Windows (MSYS2 MinGW-w64, g++ 13.2.0):

$ make chat -j
CUDA is unavailable!
src/GPTBigCodeGenerate.cc src/GPTBigCodeTokenizer.cc src/Generate.cc src/LLaMATokenizer.cc src/OPTGenerate.cc src/OPTTokenizer.cc src/utils.cc src/nn_modules/Fp32GPTBigCodeAttention.cc src/nn_modules/Fp32GPTBigCodeDecoder.cc src/nn_modules/Fp32GPTBigCodeDecoderLayer.cc src/nn_modules/Fp32GPTBigCodeForCausalLM.cc src/nn_modules/Fp32OPTAttention.cc src/nn_modules/Fp32OPTDecoder.cc src/nn_modules/Fp32OPTDecoderLayer.cc src/nn_modules/Fp32OPTForCausalLM.cc src/nn_modules/Fp32llamaAttention.cc src/nn_modules/Fp32llamaDecoder.cc src/nn_modules/Fp32llamaDecoderLayer.cc src/nn_modules/Fp32llamaForCausalLM.cc src/nn_modules/Int4GPTBigCodeAttention.cc src/nn_modules/Int4GPTBigCodeDecoder.cc src/nn_modules/Int4GPTBigCodeDecoderLayer.cc src/nn_modules/Int4GPTBigCodeForCausalLM.cc src/nn_modules/Int4OPTAttention.cc src/nn_modules/Int4OPTDecoder.cc src/nn_modules/Int4OPTDecoderLayer.cc src/nn_modules/Int4OPTForCausalLM.cc src/nn_modules/Int8OPTAttention.cc src/nn_modules/Int8OPTDecoder.cc src/nn_modules/Int8OPTDecoderLayer.cc src/nn_modules/OPTForCausalLM.cc src/ops/BMM_F32T.cc src/ops/BMM_S8T_S8N_F32T.cc src/ops/BMM_S8T_S8N_S8T.cc src/ops/LayerNorm.cc src/ops/LayerNormQ.cc src/ops/LlamaRMSNorm.cc src/ops/RotaryPosEmb.cc src/ops/W8A8B8O8Linear.cc src/ops/W8A8B8O8LinearReLU.cc src/ops/W8A8BFP32OFP32Linear.cc src/ops/arg_max.cc src/ops/batch_add.cc src/ops/embedding.cc src/ops/linear.cc src/ops/softmax.cc ../kernels/matmul_imp.cc ../kernels/matmul_int4.cc ../kernels/matmul_int8.cc ../kernels/pthread_pool.cc src/nn_modules/non_cuda/Int4llamaAttention.cc src/nn_modules/non_cuda/Int4llamaDecoder.cc src/nn_modules/non_cuda/Int4llamaDecoderLayer.cc src/nn_modules/non_cuda/Int4llamaForCausalLM.cc src/nn_modules/non_cuda/LLaMAGenerate.cc ../kernels/avx/matmul_avx_fp32.cc ../kernels/avx/matmul_avx_int4.cc ../kernels/avx/matmul_avx_int8.cc ../kernels/avx/matmul_avx_int8_int4.cc
g++ -std=c++11 -pthread -Ofast  -mavx2 -mfma -ffast-math -DUSE_INT8_INT4_PRODUCT -fpermissive -DQM_x86 -I../kernels -I./include -I./include/nn_modules -I./json/single_include/ -I./half-2.2.0/include/ -c src/nn_modules/Fp32GPTBigCodeDecoder.cc -o build/transformer/src/nn_modules/Fp32GPTBigCodeDecoder.o
g++ -std=c++11 -pthread -Ofast  -mavx2 -mfma -ffast-math -DUSE_INT8_INT4_PRODUCT -fpermissive -DQM_x86 -I../kernels -I./include -I./include/nn_modules -I./json/single_include/ -I./half-2.2.0/include/ -c src/nn_modules/Int4GPTBigCodeDecoder.cc -o build/transformer/src/nn_modules/Int4GPTBigCodeDecoder.o
In file included from ./include/operators.h:6,
                 from ./include/nn_modules/Int4GPTBigCodeAttention.h:4,
                 from ./include/nn_modules/Int4GPTBigCodeDecoderLayer.h:1,
                 from ./include/nn_modules/Int4GPTBigCodeDecoder.h:5,
                 from src/nn_modules/Int4GPTBigCodeDecoder.cc:1:
../kernels/matmul.h:5: warning: "NOMINMAX" redefined
    5 | #define NOMINMAX
      |
In file included from C:/msys64/mingw64/include/c++/13.2.0/x86_64-w64-mingw32/bits/c++config.h:679,
                 from C:/msys64/mingw64/include/c++/13.2.0/cstdlib:41,
                 from ./include/nn_modules/Int4GPTBigCodeDecoder.h:1:
C:/msys64/mingw64/include/c++/13.2.0/x86_64-w64-mingw32/bits/os_defines.h:45: note: this is the location of the previous definition
   45 | #define NOMINMAX 1
      |
In file included from ./include/operators.h:6,
                 from ./include/nn_modules/Fp32GPTBigCodeAttention.h:4,
                 from ./include/nn_modules/Fp32GPTBigCodeDecoderLayer.h:1,
                 from ./include/nn_modules/Fp32GPTBigCodeDecoder.h:5,
                 from src/nn_modules/Fp32GPTBigCodeDecoder.cc:1:
../kernels/matmul.h:5: warning: "NOMINMAX" redefined
    5 | #define NOMINMAX
      |
In file included from C:/msys64/mingw64/include/c++/13.2.0/x86_64-w64-mingw32/bits/c++config.h:679,
                 from C:/msys64/mingw64/include/c++/13.2.0/cstdlib:41,
                 from ./include/nn_modules/Fp32GPTBigCodeDecoder.h:1:
C:/msys64/mingw64/include/c++/13.2.0/x86_64-w64-mingw32/bits/os_defines.h:45: note: this is the location of the previous definition
   45 | #define NOMINMAX 1
      |
src/nn_modules/Fp32GPTBigCodeDecoder.cc: In member function 'Fp32GPTBigCodeDecoder_output Fp32GPTBigCodeDecoder::forward(const Fp32GPTBigCodeDecoder_input&)':
src/nn_modules/Fp32GPTBigCodeDecoder.cc:115:61: error: no matching function for call to 'Matrix3D<int>::Matrix3D(float*&, int, int, int&)'
  115 |     Matrix3D<int> position_ids(position_ids_buf, 1, 1, sqlen);
      |                                                             ^
In file included from ./include/nn_modules/Fp32GPTBigCodeAttention.h:3:
./include/common.h:125:5: note: candidate: 'Matrix3D<T>::Matrix3D() [with T = int]'
  125 |     Matrix3D() { m_data = NULL; }
      |     ^~~~~~~~
./include/common.h:125:5: note:   candidate expects 0 arguments, 4 provided
./include/common.h:36:5: note: candidate: 'Matrix3D<T>::Matrix3D(T*, int, int, int) [with T = int]'
   36 |     Matrix3D(T *data, int dim_x, int dim_y, int dim_z) : m_data(data), m_dim_x(dim_x), m_dim_y(dim_y), m_dim_z(dim_z) {}
      |     ^~~~~~~~
./include/common.h:36:17: note:   no known conversion for argument 1 from 'float*' to 'int*'
   36 |     Matrix3D(T *data, int dim_x, int dim_y, int dim_z) : m_data(data), m_dim_x(dim_x), m_dim_y(dim_y), m_dim_z(dim_z) {}
      |              ~~~^~~~
./include/common.h:34:7: note: candidate: 'constexpr Matrix3D<int>::Matrix3D(const Matrix3D<int>&)'
   34 | class Matrix3D {
      |       ^~~~~~~~
./include/common.h:34:7: note:   candidate expects 1 argument, 4 provided
./include/common.h:34:7: note: candidate: 'constexpr Matrix3D<int>::Matrix3D(Matrix3D<int>&&)'
./include/common.h:34:7: note:   candidate expects 1 argument, 4 provided
make: *** [Makefile:179: build/transformer/src/nn_modules/Fp32GPTBigCodeDecoder.o] Error 1
make: *** Waiting for unfinished jobs....
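
For readers who want to see the mismatch in isolation, here is a minimal sketch of what the compiler is complaining about. The Matrix3D class below is a simplified stand-in that keeps only the four-argument constructor quoted in the log (include/common.h:36) plus an indexing operator added for illustration. The make_position_ids helper and its offset parameter are hypothetical and are not taken from the repository. The static_assert lines check at compile time that a float* buffer cannot be used to construct a Matrix3D<int>, while an int* buffer can, which is the change this PR makes to position_ids_buf.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Simplified stand-in for Matrix3D in include/common.h (assumption: only the
// constructor shown in the compiler output is reproduced; operator() is added
// here for illustration).
template <typename T>
class Matrix3D {
public:
    Matrix3D(T *data, int dim_x, int dim_y, int dim_z)
        : m_data(data), m_dim_x(dim_x), m_dim_y(dim_y), m_dim_z(dim_z) {}

    // Row-major element access into the caller-owned buffer.
    T &operator()(int x, int y, int z) {
        return m_data[(x * m_dim_y + y) * m_dim_z + z];
    }

    T *m_data;
    int m_dim_x, m_dim_y, m_dim_z;
};

// There is no implicit conversion from float* to int*, so no constructor of
// Matrix3D<int> accepts a float buffer. This mirrors the "no matching
// function" error in the log.
static_assert(!std::is_constructible<Matrix3D<int>, float *, int, int, int>::value,
              "a float buffer must not construct Matrix3D<int>");
// With an int buffer the same call is well-formed.
static_assert(std::is_constructible<Matrix3D<int>, int *, int, int, int>::value,
              "an int buffer constructs Matrix3D<int>");

// Hypothetical helper: allocates an int buffer, wraps it in a 1 x 1 x sqlen
// Matrix3D<int>, and fills it with consecutive ids starting at offset.
std::vector<int> make_position_ids(int sqlen, int offset) {
    assert(sqlen >= 0);
    std::vector<int> position_ids_buf(sqlen);  // int storage, not float
    Matrix3D<int> position_ids(position_ids_buf.data(), 1, 1, sqlen);
    for (int i = 0; i < sqlen; ++i) {
        position_ids(0, 0, i) = offset + i;
    }
    return position_ids_buf;
}
```

Calling make_position_ids(4, 2) returns a buffer holding 2, 3, 4, 5. Changing position_ids_buf to std::vector<float> in the helper reproduces the same category of error as in the log above, because the element type of the buffer has to match the template argument of Matrix3D.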