QwenLM / qwen.cpp

C++ implementation of Qwen-LM

Support for AMD's ROCm #46

Open riverzhou opened 7 months ago

riverzhou commented 7 months ago

Official llama.cpp already supports ROCm; when will qwen.cpp support it?

CellerX commented 7 months ago

https://github.com/YellowRoseCx/koboldcpp-rocm: this project can use hipBLAS on Windows for GGML and GGUF models.

riverzhou commented 7 months ago

> https://github.com/YellowRoseCx/koboldcpp-rocm: this project can use hipBLAS on Windows for GGML and GGUF models.

Thanks!

riverzhou commented 7 months ago

[screenshot: qwen.cpp running with ROCm on the 7800 XT]

I modified the ggml framework to support ROCm, and added ROCm support to qwen.cpp:

https://github.com/riverzhou/qwen.cpp

Tests passed on my 7800 XT; speed is around 37 tokens/second with a 14B Q5_1 model.
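
For background, ggml's hipBLAS path reuses the existing CUDA code by aliasing the CUDA runtime API to HIP at compile time, so the same kernels build for both vendors. Below is a minimal sketch of that technique; it is simplified and illustrative, and the real alias list in ggml is much longer:

// Sketch of the CUDA-to-HIP aliasing technique used by ggml's hipBLAS path.
#if defined(GGML_USE_HIPBLAS)
#include <hip/hip_runtime.h>
#define cudaError_t  hipError_t
#define cudaSuccess  hipSuccess
#define cudaMalloc   hipMalloc
#define cudaFree     hipFree
#else
#include <cuda_runtime.h>
#endif

// The same "CUDA" code now compiles unchanged under ROCm (hipcc) or CUDA (nvcc).
int main() {
    void * buf = nullptr;
    cudaError_t err = cudaMalloc(&buf, 1024);  // hipMalloc under ROCm
    if (err == cudaSuccess) {
        cudaFree(buf);                         // hipFree under ROCm
    }
    return 0;
}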

riverzhou commented 7 months ago

I submitted a pull request to upstream ggml, and it was merged just now. For now, just add

if (GGML_HIPBLAS)
  # ggml's HIP path reuses the CUDA code paths, so both macros are defined
  add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
  # compile GPU kernels only for the requested AMD architectures
  set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()

to qwen.cpp's CMakeLists.txt and update the bundled ggml, and it will support AMD's ROCm.
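
For reference, the configure step on Linux would then look something like the sketch below. The GGML_HIPBLAS and AMDGPU_TARGETS names come from the snippet above; the ROCm compiler paths and the gfx1101 target (the RX 7800 XT's architecture) are assumptions based on a typical ROCm install:

# Hypothetical configure/build invocation using ROCm's clang toolchain
cmake -B build \
  -DGGML_HIPBLAS=ON \
  -DAMDGPU_TARGETS=gfx1101 \
  -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
  -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build -j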

louwangzhiyuY commented 7 months ago

Can it support AMD ROCm on Windows?