OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

falcon 180B generates garbage on A100 #14

Closed githubpradeep closed 11 months ago

githubpradeep commented 11 months ago

I tried running the Falcon notebook provided, and it didn't generate any coherent sentences. It just generated junk characters.

ChenMnZ commented 11 months ago

Did you install autogptq from https://github.com/ChenMnZ/AutoGPTQ-bugfix?

git clone https://github.com/ChenMnZ/AutoGPTQ-bugfix
cd AutoGPTQ-bugfix
pip install -v .

githubpradeep commented 11 months ago

I tried installing the AutoGPTQ fork above, but the installation kept failing.

ChenMnZ commented 11 months ago

The original CUDA kernel in AutoGPTQ has some bugs, so you need to keep working at installing the bug-fixed version (https://github.com/ChenMnZ/AutoGPTQ-bugfix). Once it is installed, you will get correct outputs.
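
Since the outputs depend on which AutoGPTQ build Python actually picks up, it may help to check what is installed before rerunning the notebook. The following is a minimal sketch, not part of OmniQuant or AutoGPTQ: `describe_install` is a hypothetical helper, and the distribution name `auto-gptq` is an assumption about how the package registers itself. It only reports the import location and version and cannot by itself prove that the bug-fixed fork, rather than the upstream package, is the one installed.

```python
import importlib.metadata
import importlib.util


def describe_install(module_name, dist_name=None):
    """Report whether a module is importable, where from, and its version.

    Hypothetical helper for illustration. It looks the module up without
    importing it, so no compiled extension is loaded.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return {"installed": False, "location": None, "version": None}
    try:
        # dist_name is the pip distribution name, which can differ
        # from the import name.
        version = importlib.metadata.version(dist_name or module_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {"installed": True, "location": spec.origin, "version": version}


if __name__ == "__main__":
    # "auto_gptq" is the import name; "auto-gptq" is an assumed
    # distribution name.
    print(describe_install("auto_gptq", dist_name="auto-gptq"))
```

If `installed` is `False`, the build did not complete. If `location` points somewhere other than the environment you installed into, another copy is shadowing it.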

githubpradeep commented 11 months ago

OK, I tried it with a smaller model, and it worked after installing.

BaohaoLiao commented 3 months ago

@ChenMnZ May I ask what bug you fixed? Could you point to the specific file that was modified?