Closed vackosar closed 1 month ago
I had to do some weird stuff like:
```python
# Move inputs to the same device as the model
device = next(model.parameters()).device
inputs = {k: v.to(device) for k, v in inputs.items()}
```
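For reference, the pattern above can be sketched in a self-contained way. The `Fake*` classes below are stand-ins (my assumption, not real library code) for a `torch` model and its tensor inputs, so the sketch runs without PyTorch or a GPU:

```python
# Hedged sketch of the "move inputs to the model's device" pattern.
# FakeTensor/FakeModel are hypothetical stubs standing in for torch objects.

class FakeTensor:
    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        # Like torch.Tensor.to: return a copy placed on the target device.
        return FakeTensor(device)

class FakeModel:
    def parameters(self):
        # Like nn.Module.parameters: yield parameters living on the GPU.
        yield FakeTensor("cuda:0")

model = FakeModel()
inputs = {"input_ids": FakeTensor(), "attention_mask": FakeTensor()}

# The actual pattern from the workaround:
device = next(model.parameters()).device
inputs = {k: v.to(device) for k, v in inputs.items()}

print(all(v.device == "cuda:0" for v in inputs.values()))  # True
```

With real `transformers` objects the two lines after the stubs are identical; only the surrounding setup differs.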
Seems the AWQ part of this repo is working again.
I wanted to quantize
model_name = "cognitivecomputations/dolphin-2.9.4-llama3.1-8b"
But I am getting an error:
How do I fix this?