huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
134.36k stars 26.87k forks

Add support for non-CUDA architectures at the same time Bitsandbytes is doing it #31248

Open sealad886 opened 5 months ago

sealad886 commented 5 months ago

Feature request

Currently, the helper/setup functions explicitly check for CUDA support: https://github.com/huggingface/transformers/blob/8685b3c5d2dd2550527773d2a02499495a759e31/src/transformers/quantizers/quantizer_bnb_4bit.py#L60-L63
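For context, the check being referenced roughly reduces to the pattern below. This is a simplified sketch, not the actual transformers source: the real check inspects the torch device state, while here availability is passed in as a plain boolean so the behavior is easy to see.

```python
def validate_cuda_only(cuda_available: bool) -> None:
    # Simplified sketch of the hard-coded pattern under discussion:
    # only CUDA passes the check, even if another accelerator
    # (e.g. Apple MPS, ROCm, Intel XPU) is present.
    if not cuda_available:
        raise RuntimeError("No GPU found. A GPU is needed for quantization.")

validate_cuda_only(True)  # passes silently
try:
    validate_cuda_only(False)
except RuntimeError as e:
    print(e)  # No GPU found. A GPU is needed for quantization.
```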

BNB is currently doing a project to enable support for other GPU backends: ALPHA TESTERS WANTED

Motivation

Apple MPS support is being added by many major projects; it would be great for the biggest one of all to support it as its dependencies do. It would also be good not to hard-code this kind of limitation, so that code changes aren't needed every time a dependent library updates.

Your contribution

idea done

amyeroberts commented 5 months ago

cc @younesbelkada @SunMarc

younesbelkada commented 5 months ago

Thanks! We could instead use an environment variable to temporarily bypass this check, to easily let users experiment with transformers + the bnb multi-backend refactor.
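The env-var escape hatch could look something like the sketch below. The variable name `BNB_SKIP_CUDA_CHECK` and the function name are illustrative assumptions, not the actual flag or API; CUDA availability is again passed in as a boolean to keep the sketch self-contained.

```python
import os

def check_backend(cuda_available: bool) -> bool:
    """Return True if quantization may proceed.

    Hypothetical sketch: an opt-in env var (name assumed) lets users
    bypass the hard CUDA check while non-CUDA backends are in alpha.
    """
    if os.environ.get("BNB_SKIP_CUDA_CHECK", "0") == "1":
        return True  # user explicitly opted in to experimental backends
    return cuda_available

os.environ["BNB_SKIP_CUDA_CHECK"] = "1"
print(check_backend(False))  # True: check bypassed by the env var
```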

See also: https://github.com/huggingface/transformers/pull/31098

WDYT @Titus-von-Koeller ?

Titus-von-Koeller commented 5 months ago

cc for visibility @pnunna93 @Xia-Weiwen @jianan-gu @matthewdouglas and others involved in the multi-backend refactor

matthewdouglas commented 5 months ago

I think we're probably going to need some changes in accelerate as well, where there are similar device checks.

Something we could consider is the addition of an 'a' suffix on the version number in the refactor branch and check against that. An environment variable sounds reasonable too.
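Detecting the proposed 'a' suffix amounts to a pre-release check on the version string. A minimal sketch follows; the function name and regex are illustrative, and real code would more likely use `packaging.version.Version(...).is_prerelease` than hand-rolled parsing.

```python
import re

def is_prerelease(version: str) -> bool:
    # PEP 440-style pre-release suffix: "a" (alpha), "b" (beta), or "rc",
    # optionally followed by a number, e.g. "0.44.0a0" from an alpha build.
    return re.search(r"(a|b|rc)\d*$", version) is not None

print(is_prerelease("0.44.0a0"))  # True: an alpha build
print(is_prerelease("0.43.1"))   # False: a regular release
```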

Titus-von-Koeller commented 2 months ago

Will be resolved by #31098. However, Apple Silicon support is still not implemented. For now we will work on enabling the BNB AMD and Intel backends; once an Apple Silicon implementation is provided by the community, we may add any further necessary tweaks to Transformers in a separate PR.