unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0
18.7k stars · 1.31k forks

[Feature Request] AMD GPU #37

Open fakerybakery opened 11 months ago

fakerybakery commented 11 months ago

Hi, Does Unsloth support AMD GPUs? Thank you!

danielhanchen commented 11 months ago

Technically yes - support for AMD and Intel GPUs is possible through each vendor's respective implementation of Triton. The OSS version, however, relies on bitsandbytes, which only supports CUDA.
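Since the split is between Triton (which has a ROCm backend) and bitsandbytes (CUDA-only at the time), one quick check is which stack PyTorch itself was built against, via `torch.version`. The helper below is a minimal sketch, not Unsloth code; it takes the `torch.version`-like object as an argument so the logic is testable without a GPU.

```python
# Minimal sketch (not part of Unsloth): classify a PyTorch build as
# ROCm, CUDA, or CPU-only. torch.version exposes `hip` and `cuda`
# attributes, which are None on builds lacking that backend.
def gpu_backend(version_mod) -> str:
    """Return 'rocm', 'cuda', or 'cpu' for a torch.version-like object."""
    if getattr(version_mod, "hip", None):    # e.g. a "6.2...." string on ROCm builds
        return "rocm"
    if getattr(version_mod, "cuda", None):   # e.g. "12.1" on CUDA builds
        return "cuda"
    return "cpu"

# Usage (assumes torch is installed):
#   import torch
#   print(gpu_backend(torch.version))
```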

asmith26 commented 1 month ago

Looks like bitsandbytes now supports AMD :) https://x.com/Tim_Dettmers/status/1841151115326570639

shimmyshimmer commented 1 month ago

Looks like bitsandbytes now supports AMD :) https://x.com/Tim_Dettmers/status/1841151115326570639

Fantastic news - will do when we get the time however it will be lower priority for now.

sayanmndl21 commented 1 month ago

I successfully ran Unsloth on the MI210 GPU. The issue was related to Triton's constraints on HIP, where the block size and the number of warps must not exceed 2^14 (16384) and 16, respectively.
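Those limits can be sketched as a clamp on the requested Triton launch parameters before dispatch. The helper name and structure below are hypothetical (not Unsloth's API); the two bounds are the ones reported above.

```python
# Hedged sketch of the HIP constraints described above: block size
# capped at 2**14 and number of warps capped at 16. The function name
# is hypothetical, not part of Unsloth or Triton.
MAX_HIP_BLOCK_SIZE = 2 ** 14  # 16384
MAX_HIP_NUM_WARPS = 16

def clamp_hip_launch_params(block_size: int, num_warps: int) -> tuple:
    """Clamp requested Triton launch parameters to the HIP limits."""
    return (min(block_size, MAX_HIP_BLOCK_SIZE),
            min(num_warps, MAX_HIP_NUM_WARPS))
```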

Here’s the commit implementing HIP support: Commit for HIP Support

Branch: Unsloth HIP

Environment Details:

Relevant Pip Packages:

Feedback appreciated!

Alfeusg commented 1 week ago

Relevant Pip Packages:

  • bitsandbytes==0.43.3.dev0 (from official ROCm)

  • huggingface==0.0.1

  • transformers==4.44.2

  • trl==0.11.1

  • xformers==0.0.29+77c1da7f.d20241019 (from official ROCm)

What do you mean by a package being "(from official ROCm)"? How exactly can I install these specific packages/versions for ROCm 6.2?

sayanmndl21 commented 1 week ago

Relevant Pip Packages:

  • bitsandbytes==0.43.3.dev0 (from official ROCm)

  • huggingface==0.0.1

  • transformers==4.44.2

  • trl==0.11.1

  • xformers==0.0.29+77c1da7f.d20241019 (from official ROCm)

What do you mean by a package being "(from official ROCm)"? How exactly can I install these specific packages/versions for ROCm 6.2?

@Alfeusg please check: ROCm documentation for ML Acceleration
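For anyone landing here later, one possible route is building bitsandbytes from the ROCm fork. The repository URL and the `COMPUTE_BACKEND` CMake flag below are assumptions drawn from the bitsandbytes multi-backend instructions, so treat this as a sketch and defer to the ROCm documentation linked above.

```shell
# Sketch only: build bitsandbytes with the HIP backend on a ROCm 6.x machine.
# Repo URL and -DCOMPUTE_BACKEND value are assumptions; verify them against
# the ROCm documentation before running.
git clone --depth 1 https://github.com/ROCm/bitsandbytes.git
cd bitsandbytes
cmake -DCOMPUTE_BACKEND=hip -S .   # configure for HIP instead of CUDA
make -j"$(nproc)"                  # build the shared library
pip install .                      # install the build into the current env
```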