punica-ai / punica

Serving multiple LoRA finetuned LLMs as one
https://arxiv.org/abs/2310.18547
Apache License 2.0
883 stars · 40 forks

Add support for running on Colab #5

Open · dzlab opened 7 months ago

dzlab commented 7 months ago

I'm not able to install this library on Colab. I tried this:

git clone https://github.com/punica-ai/punica
cd punica && pip install .

But it fails with the following error:

Processing /content/punica
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'error'
Cloning into 'punica'...
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
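
For reference, here is a minimal standalone check (my own snippet, not part of punica) of what GPU the Colab runtime actually exposes; nvcc is already on the Colab image, so it compiles there:

// check_arch.cu -- hypothetical helper, not from the punica repo:
// prints the compute capability of device 0.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  cudaDeviceProp prop;
  cudaError_t err = cudaGetDeviceProperties(&prop, 0);
  if (err != cudaSuccess) {
    std::printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
    return 1;
  }
  // The free Colab tier typically hands out a Tesla T4, which reports sm_75.
  std::printf("%s: sm_%d%d\n", prop.name, prop.major, prop.minor);
  return 0;
}

Compiled and run with nvcc check_arch.cu -o check_arch && ./check_arch.
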
abcdabcd987 commented 7 months ago

Updated. Can you try the latest commit and follow the instructions in the README?

dzlab commented 7 months ago

It runs into multiple compilation errors:

  writing manifest file 'punica.egg-info/SOURCES.txt'
  running build_ext
  /usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 11.8
    warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
  building 'punica.ops._kernels' extension
  creating /content/punica/build/temp.linux-x86_64-cpython-310
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc/flashinfer_adapter
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc/rms_norm
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv
  creating /content/punica/build/temp.linux-x86_64-cpython-310/csrc/sgmv_flashinfer
  Emitting ninja build file /content/punica/build/temp.linux-x86_64-cpython-310/build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  [1/6] /usr/local/cuda/bin/nvcc  -I/content/punica/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /content/punica/csrc/bgmv/bgmv_all.cu -o /content/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
  FAILED: /content/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv/bgmv_all.o
  /usr/local/cuda/bin/nvcc  -I/content/punica/third_party/cutlass/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /content/punica/csrc/bgmv/bgmv_all.cu -o /content/punica/build/temp.linux-x86_64-cpython-310/csrc/bgmv/bgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
  /content/punica/csrc/bgmv/../flashinfer/vec_dtypes.cuh(935): error: identifier "make_bfloat162" is undefined

  /content/punica/csrc/bgmv/../flashinfer/vec_dtypes.cuh(986): error: identifier "make_bfloat162" is undefined

  /content/punica/csrc/bgmv/bgmv_impl.cuh(181): error: no operator "+=" matches these operands
              operand types are: nv_bfloat16 += float
            detected during:
              instantiation of "void bgmv_expand_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, float) [with feat_in=8, feat_out=768, T=nv_bfloat16]"
  (199): here
              instantiation of "void bgmv_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, int64_t, float) [with feat_in=8, feat_out=768, T=nv_bfloat16]"
  /content/punica/csrc/bgmv/bgmv_all.cu(5): here

  /content/punica/csrc/bgmv/bgmv_impl.cuh(135): error: no operator "+=" matches these operands
              operand types are: nv_bfloat16 += float
            detected during:
              instantiation of "void bgmv_shrink_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, float) [with feat_in=768, feat_out=8, T=nv_bfloat16]"
  (205): here
              instantiation of "void bgmv_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, int64_t, float) [with feat_in=768, feat_out=8, T=nv_bfloat16]"
  /content/punica/csrc/bgmv/bgmv_all.cu(5): here

  /content/punica/csrc/bgmv/bgmv_impl.cuh(181): error: no operator "+=" matches these operands
              operand types are: nv_bfloat16 += float
            detected during:
              instantiation of "void bgmv_expand_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, float) [with feat_in=8, feat_out=1024, T=nv_bfloat16]"
  (199): here
              instantiation of "void bgmv_kernel<feat_in,feat_out,T>(T *, const T *, const T *, const int64_t *, int64_t, int64_t, int64_t, float) [with feat_in=8, feat_out=1024, T=nv_bfloat16]"
  /content/punica/csrc/bgmv/bgmv_all.cu(5): here

yzh119 commented 7 months ago

@dzlab It seems your GPU architecture is sm_75; currently, our codebase requires sm_80 or later. I'm adding sm_75 support now.
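
For context: in the CUDA 11.8 headers, the bfloat16 helpers the log complains about (make_bfloat162 and the += overload) are only provided for __CUDA_ARCH__ >= 800, which is why the instantiations fail when nvcc targets sm_75. A minimal sketch of an sm_75-compatible pattern (an illustration of mine, not the actual bgmv kernel) is to accumulate through explicit float conversions, which in my reading of the headers are not restricted to sm_80:

// toy_axpy_bf16.cu -- hypothetical example, not punica's bgmv_impl.cuh.
#include <cuda_bf16.h>

__global__ void axpy_bf16(nv_bfloat16 *y, const nv_bfloat16 *x,
                          float alpha, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;
  // y[i] += alpha * __bfloat162float(x[i]);  // needs the bf16 "+=" overload: sm_80+ only on CUDA 11.8
  float acc = __bfloat162float(y[i]);         // explicit bf16 <-> float conversions also compile for sm_75
  acc += alpha * __bfloat162float(x[i]);
  y[i] = __float2bfloat16(acc);
}

The alternative, of course, is simply to gate the bf16 instantiations out for sm < 80 and keep only the fp16 path there.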

SwapnilDreams100 commented 4 months ago

Hey @yzh119, any update on sm_75 support for the punica LoRA kernels?