intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

Does the adapt_transformers class adapt any transformer or HF model? #1666

Closed sleepingcat4 closed 3 months ago

sleepingcat4 commented 3 months ago

I went through the Habana code inside this repo and saw how Intel uses a shortcut method to adapt arbitrary transformer modules to run on Intel Gaudi/Gaudi2 hardware. I haven't read the underlying code written by the Habana devs, but I was curious: can it, in theory, transform any transformer-based model, including those written with CUDA in mind?
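As a point of reference, adaptation layers of this kind are commonly implemented by monkey-patching model classes at import time, replacing device-specific methods with versions targeting the new hardware. The sketch below is purely illustrative (the class and function names are hypothetical, not the actual Habana/optimum code); it shows why such a patch can apply to any model built from the stock classes, even one written with CUDA in mind:

```python
# Hypothetical sketch of class-level monkey-patching, the general technique
# behind "adapt any transformer module" layers. Names are illustrative only.

class Attention:
    """Stand-in for a stock transformers attention module."""
    def forward(self, x):
        # Original path, assumed to target CUDA kernels.
        return f"cuda_attention({x})"

def gaudi_forward(self, x):
    # Replacement forward that would dispatch to Gaudi-optimized kernels.
    return f"gaudi_attention({x})"

def adapt_transformers():
    """Patch the stock class in place. Every model instantiated from it,
    before or after this call, picks up the new forward path."""
    Attention.forward = gaudi_forward

adapt_transformers()
print(Attention().forward("tokens"))  # -> gaudi_attention(tokens)
```

Because the patch happens on the class rather than on individual model instances, any architecture composed from these modules inherits the hardware-specific path without its own code changing.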

Forgive my ignorance, but I don't think Intel offers a native framework for developing architectures or models that leverage its hardware and are fine-tuned for it. Almost all of its hardware is designed with inference in mind rather than both training and inference. That's why I was wondering whether it can run models like insanely-fast-whisper or faster-whisper at a speed equal to or faster than an Nvidia A100/H100/H200?

https://huggingface.co/Systran/faster-whisper-medium

a32543254 commented 3 months ago

Please raise your issue in the Habana repo.