8-bit compression doesn't work for Dolly

lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Apache License 2.0

Closed. PCIHD closed this issue 10 months ago.

PCIHD commented 1 year ago

Since there is no non-fast (slow) tokenizer for GPT-NeoX, Dolly fails to load in 8-bit compression mode.

There needs to be a different way of handling the parameters, perhaps a method in the adapters that initializes the tokenizers differently.
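One possible shape for this, as a minimal sketch only and not FastChat's actual code: try the slow tokenizer first and fall back to the fast one when the slow one cannot be built. The helper name `load_tokenizer` and its `loader` parameter are hypothetical, and the set of exceptions caught is an assumption, since the exact error depends on the `transformers` version. `AutoTokenizer.from_pretrained` and its `use_fast` flag are the real Hugging Face API.

```python
from typing import Any, Callable, Optional


def load_tokenizer(model_path: str, loader: Optional[Callable[..., Any]] = None) -> Any:
    """Load a tokenizer, preferring the slow one and falling back to the fast one.

    `loader` is a hypothetical injection point (for tests); by default it is
    transformers' AutoTokenizer.from_pretrained.
    """
    if loader is None:
        # Imported lazily so the helper can be exercised without transformers.
        from transformers import AutoTokenizer

        loader = AutoTokenizer.from_pretrained

    try:
        # Slow (Python) tokenizer first.
        return loader(model_path, use_fast=False)
    except (ValueError, OSError, TypeError):
        # Assumption: a model family with only a fast tokenizer fails with
        # one of these errors, so retry with the fast implementation.
        return loader(model_path, use_fast=True)
```

An adapter-level switch (each model adapter declaring whether it needs `use_fast=True`) would avoid the retry, at the cost of touching every adapter.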

merrymercy commented 1 year ago

cc @andy-yang-1

andy-yang-1 commented 1 year ago

@merrymercy @PCIHD The problem is fixed. See #1438

surak commented 10 months ago

This has been fixed since June. Closing this one.