huggingface / text-embeddings-inference

A blazing fast inference solution for text embeddings models
https://huggingface.co/docs/text-embeddings-inference/quick_tour
Apache License 2.0

Support TEI on AMD GPUs #108

Open japarada opened 8 months ago

japarada commented 8 months ago

Feature request

Are there active plans to add support for generating embeddings on AMD GPUs?

Motivation

AMD and Hugging Face are collaborating to ensure compatibility of Hugging Face transformers with AMD ROCm and hardware platforms. Supporting the HF TEI toolkit would add a compelling alternative for deploying and serving open-source text embedding and sequence classification models.

Your contribution

Work with others on PRs to integrate the changes needed to support AMD GPUs.

OlivierDehaene commented 8 months ago

Indeed, it would be interesting to support more backends. Support for AMD GPUs will be added in Candle though, not in TEI directly.

japarada commented 7 months ago

@OlivierDehaene Thanks for the response. Does TEI support a Python backend instead of Candle? I see https://github.com/huggingface/text-embeddings-inference/tree/d05c949c1234786c15b675f4419776a417519583/backends/python. Is this Python code only for the gRPC server implementation?

OlivierDehaene commented 7 months ago

This backend is an example of how you would go about adding other backends to TEI. It is currently outdated but could be updated to support AMD.

japarada commented 7 months ago

Is there any active effort to make the Python backend fully functional? Which models can be run at the moment? Thank you for all the answers.

hvico commented 3 months ago

Hi! Is there any WiP regarding ROCm support for TEI? Thanks!

dcbark01 commented 3 months ago

> Hi! Is there any WiP regarding ROCm support for TEI? Thanks!

I'm interested in this as well. I may start working on it myself, but I don't want to duplicate effort if it's already in the works.

fxmarty commented 2 months ago

Hi, there is some progress in https://github.com/huggingface/text-embeddings-inference/pull/293. Would you mind sharing which AMD GPUs you are using? Thank you!

dcbark01 commented 2 months ago

Currently using MI250s.