lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

8-bit compression doesn't work for Dolly #1409

Closed PCIHD closed 10 months ago

PCIHD commented 1 year ago


Since GPT-NeoX has no non-fast (slow) tokenizer, Dolly fails to load in 8-bit compression mode.

The tokenizer parameters need to be handled differently, perhaps through a method in the model adapters that initializes the tokenizer differently for each model.
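One way this could look is a small fallback helper: try the preferred tokenizer variant, and if that fails, retry with the other one. This is only a sketch under assumptions, not FastChat's actual fix. The name `load_tokenizer_with_fallback` and the `load_fn` parameter are hypothetical, and the set of exceptions caught is a guess at what a loader might raise when the requested variant does not exist.

```python
from typing import Any, Callable


def load_tokenizer_with_fallback(
    load_fn: Callable[..., Any],
    model_path: str,
    prefer_fast: bool = False,
    **kwargs: Any,
) -> Any:
    """Load a tokenizer, retrying with the opposite `use_fast` value on failure.

    `load_fn` is any callable that accepts `(model_path, use_fast=..., **kwargs)`.
    The helper itself is hypothetical and not part of any library.
    """
    try:
        # First attempt: the variant the caller prefers (slow by default).
        return load_fn(model_path, use_fast=prefer_fast, **kwargs)
    except (ValueError, TypeError, OSError):
        # Assumption: a missing tokenizer variant surfaces as one of these
        # exception types. Retry once with the other variant; if that also
        # fails, the second exception propagates to the caller.
        return load_fn(model_path, use_fast=not prefer_fast, **kwargs)
```

With Hugging Face `transformers`, `load_fn` could be `AutoTokenizer.from_pretrained`, which accepts a `use_fast` argument. Passing the loader in as a parameter keeps the helper independent of any one library and lets each adapter choose its own default.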

merrymercy commented 1 year ago

cc @andy-yang-1

andy-yang-1 commented 1 year ago

@merrymercy @PCIHD The problem is fixed. See #1438

surak commented 10 months ago

This has been fixed since June. Closing this one.