intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

do you plan to support GPUs? #51

Closed mgrabban closed 7 months ago

mgrabban commented 1 year ago

Hello, is there any plan to support Intel discrete GPUs?

kevinintel commented 1 year ago

We plan to support it, and the first example will be GPT-J.

mgrabban commented 1 year ago

> We plan to support it, and the first example will be GPT-J.

Great to know that! Is there any estimated timeline for this? Thanks.

rahulunair commented 1 year ago

Any updates on this? The docs suggest GPU support, but I don't think the wheels are there yet.

kevinintel commented 1 year ago

Our quantization supports GPU, but inference is not ready. We may release a GPT-J inference example on Intel GPU in about 2 months.

rahulunair commented 1 year ago

I would love to try models like Alpaca or Vicuna on my Arc card with LoRA. Hopefully you will be releasing fine-tuning examples as well.

kevinintel commented 1 year ago

> I would love to try models like Alpaca or Vicuna on my Arc card with LoRA. Hopefully you will be releasing fine-tuning examples as well.

That depends on https://github.com/intel/intel-extension-for-pytorch supporting fine-tuning on Arc.

rahulunair commented 1 year ago

Hey, can I do inference on Intel GPUs using transformers? And can I fine-tune, maybe on the Intel dGPU Max series, using Hugging Face? With the accelerate backend now supporting XPUs, is it possible to fine-tune a model and do inference using intel-extension-for-transformers?
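
As a starting point, a minimal sanity check that an XPU is actually visible to PyTorch and Accelerate might look like the sketch below. It assumes intel-extension-for-pytorch and accelerate are installed, and the exact behavior may differ between versions:

```python
# Sketch: check whether PyTorch and Accelerate can see an Intel XPU device.
# Assumes intel-extension-for-pytorch and accelerate are installed.
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device with PyTorch

print("XPU available:", torch.xpu.is_available())
if torch.xpu.is_available():
    print("Device count:", torch.xpu.device_count())
    print("Device name:", torch.xpu.get_device_name(0))

from accelerate import Accelerator

# Accelerate should pick up the XPU automatically when one is visible.
accelerator = Accelerator()
print("Accelerate device:", accelerator.device)
```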

kevinintel commented 12 months ago

It doesn't support inference on Intel GPU yet.

smacz42 commented 11 months ago

> Our quantization supports GPU

Does this statement mean that the quantization methodology that allowed falcon-7b-instruct to be run in 6GB of RAM, as outlined in this Medium article, is available on Arc GPUs?

I am just trying to understand the current state of these features and all the parts needed to make that work :)
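
For concreteness, my understanding of the CPU-side flow the article describes is roughly the sketch below, based on the project's README. The model name and the load_in_4bit argument are assumptions on my part and may differ between releases:

```python
# Sketch of CPU 4-bit weight-only quantized inference with intel-extension-for-transformers.
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "tiiuae/falcon-7b-instruct"  # assumed model from the article
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer("Once upon a time,", return_tensors="pt").input_ids

# load_in_4bit quantizes the weights on the fly so the model fits in a few GB of RAM
model = AutoModelForCausalLM.from_pretrained(
    model_name, load_in_4bit=True, trust_remote_code=True
)
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```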

kevinintel commented 10 months ago

For quantization on Arc, please create a ticket in https://github.com/intel/neural-compressor

kevinintel commented 7 months ago

We support Intel GPU starting with v1.3.1; the first model is Qwen. Please take a look at the weight-only quantization document.
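
For reference, the flow in that document looks roughly like the sketch below. The load_in_4bit and device_map="xpu" arguments are assumptions based on that document and may differ between releases:

```python
# Sketch: weight-only quantized inference for Qwen on an Intel GPU (XPU).
import intel_extension_for_pytorch as ipex  # needed so the "xpu" device is available
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

device = "xpu"
model_name = "Qwen/Qwen-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer("Once upon a time, there existed a little girl,",
                   return_tensors="pt").input_ids.to(device)

# 4-bit weight-only quantization, with the quantized model placed on the XPU
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,
    device_map=device,
    trust_remote_code=True,
)
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```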