Open maziyarpanahi opened 7 months ago
Maybe a relevant conversation: https://huggingface.co/databricks/dbrx-instruct/discussions/10
We published a 4-bit bnb version here: https://huggingface.co/PrunaAI/dbrx-base-bnb-4bit :)
Thanks @johnrachwan123 - So is the current bitsandbytes compatible with DBRX?
The model weights don't use `nn.Linear`, so it's not an out-of-the-box solution. There are converted versions of the model out there that work right away.
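To illustrate why that matters, here is a minimal sketch with a toy module, not the actual DBRX code. It assumes the quantization pass only swaps out `nn.Linear` submodules, so weights held as a bare `nn.Parameter` are left in full precision. `FusedExperts`, `ToyBlock`, and `linear_coverage` are hypothetical names made up for this example.

```python
import torch
from torch import nn


class FusedExperts(nn.Module):
    """Toy stand-in: all expert weights live in one raw nn.Parameter."""

    def __init__(self, num_experts: int, hidden: int, ffn: int) -> None:
        super().__init__()
        # No nn.Linear here, so a pass that looks for nn.Linear skips this.
        self.w1 = nn.Parameter(torch.zeros(num_experts * ffn, hidden))


class ToyBlock(nn.Module):
    """Toy block mixing one nn.Linear with the fused expert weights."""

    def __init__(self) -> None:
        super().__init__()
        self.attn_proj = nn.Linear(8, 8, bias=False)
        self.experts = FusedExperts(num_experts=4, hidden=8, ffn=16)


def linear_coverage(model: nn.Module) -> tuple[int, int]:
    """Return (parameters owned directly by nn.Linear modules, all parameters)."""
    total = sum(p.numel() for p in model.parameters())
    in_linear = sum(
        p.numel()
        for module in model.modules()
        if isinstance(module, nn.Linear)
        for p in module.parameters(recurse=False)
    )
    return in_linear, total


in_linear, total = linear_coverage(ToyBlock())
print(f"{in_linear} of {total} parameters sit in nn.Linear modules")
```

In this toy block only the small projection would be eligible for replacement, while the bulk of the parameters would stay unquantized. Running `linear_coverage` on a real checkpoint is one way to check how much of a model a Linear-only pass could reach.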
I was able to get this model loaded with bitsandbytes, and while I didn't try generation, I was able to train the model a bit and get a decreasing loss:
SinclairSchneider/dbrx-base-quantization-fixed
Some features, like gradient checkpointing, do not work, but I think it's good enough for now until it's officially supported.
Feature request
Support for DBRX Instruct model in bitsandbytes
Motivation
DBRX Instruct is supposed to be the best open LLM, but its 132B parameters make it unusable for most. I tried this
But it loads the model fully (maybe I am missing something).
Your contribution
I am willing to test any PR.