IBM / text-generation-inference

IBM development fork of https://github.com/huggingface/text-generation-inference
Apache License 2.0

🔥 Remove our exllama code because we use auto-gptq vendored kernels #59

Closed: tjohnson31415 closed this 6 months ago

tjohnson31415 commented 6 months ago

Motivation

We recently found that AutoGPTQ vendors its own versions of the exllama and exllamav2 kernels in autogptq_extension, and these are installed with the library. Since we install AutoGPTQ after installing our own builds of the exllama kernels, the AutoGPTQ ones overwrite our copies. So it turns out that we don't need to vendor and compile our own exllama kernels.
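One way to see which copy of the kernels ends up installed is to ask Python where it would load them from. The snippet below is a minimal sketch, not part of this PR: the helper `locate_module` is hypothetical, and the module names `exllama_kernels` and `exllamav2_kernels` are assumptions about what the compiled extensions are called in a given environment.

```python
import importlib.util


def locate_module(name: str):
    """Return the file path Python would load for `name`, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialized modules; treat as missing
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Assumed extension module names; adjust to whatever the environment actually installs.
    for name in ("exllama_kernels", "exllamav2_kernels"):
        print(f"{name}: {locate_module(name) or 'not installed'}")
```

Running this in the image before and after AutoGPTQ is installed should show whether the resolved path changes, which is the overwrite described above.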

Modifications

Removes our vendored copies of the exllama kernels.

Result

There should be no functional changes other than faster build times and less code.