hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License

Vicuna problem #160

Status: Closed (closed by zhound420 1 year ago)

zhound420 commented 1 year ago

Has anyone got this model to work yet? Running into this:

OSError: anon8231489123/vicuna-13b-GPTQ-4bit-128g does not appear to have a file named pytorch_model-00001-of-00003.bin. Checkout 'https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g/main' for available files.

jota2rz commented 1 year ago

I don't think basaran supports GPTQ pre-quantized models. https://github.com/oobabooga/text-generation-webui supports this model. Documentation at https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-(4-bit-mode)

Feature request? 👀

peakji commented 1 year ago

Basaran should work with Vicuna models. The model repo seems to contain outdated configs that point to non-existing weight files: https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g/discussions/15

Also, you may want to install safetensors, as the repo only provides weights in safetensors format.
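
One way to confirm this kind of mismatch is to compare the shard names a checkpoint index refers to against the files actually present in the repo. The sketch below assumes the usual sharded-checkpoint layout, where an index JSON file holds a `weight_map` from parameter names to shard file names. The helper name `missing_shards`, the index contents, and the file listing are all hypothetical and only illustrate the check.

```python
from __future__ import annotations

import json


def missing_shards(index_json: str, available_files: list[str]) -> list[str]:
    """Return shard files named in the index's weight_map but absent from the repo."""
    weight_map = json.loads(index_json).get("weight_map", {})
    referenced = set(weight_map.values())
    return sorted(referenced - set(available_files))


# Hypothetical index contents and file listing, for illustration only.
example_index = json.dumps({
    "weight_map": {
        "model.embed_tokens.weight": "pytorch_model-00001-of-00003.bin",
        "model.norm.weight": "pytorch_model-00003-of-00003.bin",
    }
})
example_files = ["config.json", "tokenizer.model", "example-4bit-128g.safetensors"]

# Any name printed here is a shard the index expects but the repo does not provide.
print(missing_shards(example_index, example_files))
```

If the list is non-empty, the index is stale relative to the uploaded weights, which would be consistent with the OSError reported above.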

jota2rz commented 1 year ago

> Basaran should work with Vicuna models.

Do you know how to make it work?

I get this error.

ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a `tokenizers` library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

jota2rz commented 1 year ago

Oops, I forgot to install the extra dependencies since I was inside a venv. I needed transformers, sentencepiece, and safetensors: pip install safetensors transformers[sentencepiece]

Works good!
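
Since a fresh venv starts without these packages, a quick check inside the active environment can show which ones are still missing before starting the server. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the package list simply mirrors the three named above.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# The three packages mentioned in this thread.
required = ["transformers", "sentencepiece", "safetensors"]

missing = missing_packages(required)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it with the venv's own Python interpreter, otherwise it reports on the system environment instead.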

fardeon commented 1 year ago

We will add safetensors support in the next release: https://github.com/hyperonym/basaran/pull/174 https://github.com/hyperonym/basaran/pull/175

zhound420 commented 1 year ago

> Oops, I forgot to install the extra dependencies since I was inside a venv. I needed transformers, sentencepiece, and safetensors: pip install safetensors transformers[sentencepiece]
>
> Works good!

Hey, I'd appreciate it if you could help me run this model with Basaran. Could you point me to the right Hugging Face repository? Thanks.

jota2rz commented 1 year ago

@zhound420 https://rentry.org/nur779

zhound420 commented 1 year ago

> @zhound420 https://rentry.org/nur779

Thank you, you rock.

karfly commented 1 year ago

@fardeon @peakji

Hi, guys! I still don't understand: are GPTQ 4-bit models supported or not?

karfly commented 1 year ago

@zhound420 did you manage to run the GPTQ model?

zhound420 commented 1 year ago

@karfly no I did not yet. I'll have to come back to it in a couple days.

karfly commented 1 year ago

@zhound420 looking forward to hearing from you!

advaitdeshmukh commented 1 year ago

> @karfly no I did not yet. I'll have to come back to it in a couple days.

Did you manage to do it? Kinda stuck on the same.

Edit: I'm trying to use it with the Docker image (1st option).