lucasjinreal opened this issue 10 months ago
Just add this to the config.json
"pad_token_id": 0,
> Just add this to the config.json
> "pad_token_id": 0,

Where is the config.json?
It's the config.json that should be part of your files: https://huggingface.co/TheBloke/CodeLlama-13B-Python-GPTQ/tree/main
So did anyone manage to get coherent sentences out of the model yet? It barely acknowledges my questions.
> So did anyone manage to get coherent sentences out of the model yet? It barely acknowledges my questions.
I have tried Phind-CodeLlama-34B with example-chatbot.py and the output is really bad: it repeats words endlessly. I have read that people have gotten it to work, so maybe it's an exllama issue, I don't know. I am new to all of this.
I also tried the new WizardCoder-Python-34B but it gives me this error:

    with safe_open(self.config.model_path, framework = "pt", device = "cpu") as f:
    safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
WizardCoder-Python-34B works well for me. All the other TheBloke models seem defective.
> I also tried the new WizardCoder-Python-34B but it gives me this error:
> `safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer`
I fixed this issue by deleting the model and downloading it again. And I can confirm WizardCoder-Python is the only one that works well for me so far.
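That fix fits the error: `MetadataIncompleteBuffer` usually means the `.safetensors` file was truncated mid-download. A safetensors file starts with an 8-byte little-endian header length followed by that many bytes of JSON metadata, so a quick sanity check can be sketched before re-downloading blindly (this checker is my own, not part of the safetensors library):

```python
import json
import struct

def safetensors_header_ok(path):
    """Heuristic check for a truncated safetensors download.

    The file begins with an 8-byte little-endian unsigned header length,
    followed by that many bytes of JSON metadata. A partial download
    typically fails one of these reads, which is what the
    MetadataIncompleteBuffer error reports at load time.
    """
    try:
        with open(path, "rb") as f:
            (header_len,) = struct.unpack("<Q", f.read(8))
            header = f.read(header_len)
            if len(header) != header_len:
                return False  # file ends before the header does: truncated
            json.loads(header)  # metadata must be valid JSON
            return True
    except (OSError, struct.error, json.JSONDecodeError):
        return False
```

If this returns `False` for a downloaded shard, deleting and re-downloading that file, as above, is the straightforward remedy.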
@dred0n I think you were right.
especially these quantized models (the problems might mainly be caused by quantization).
Did you test WizardCoder-34B quantized with exllama?
@dred0n Hi, can you share your WizardCoder-34B quantized model? GPTQ?
@lucasjinreal Yes, it works well. I'm using TheBloke's WizardCoder-34B and the results are the same as the demo WizardLM put up.
@dred0n How about the quantized model? Which inference framework are you using: exllama, llama.cpp, or HF?
    exllama/model.py", line 45, in init
        self.pad_token_id = read_config["pad_token_id"]
    KeyError: 'pad_token_id'
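That `KeyError` is exactly what the `"pad_token_id": 0` edit at the top of the thread works around: a plain `dict["key"]` lookup on a `config.json` that lacks the key. A hedged sketch of the same read done with a fallback, not exllama's actual code:

```python
import json

def read_pad_token_id(config_path, default=0):
    """Hypothetical defensive version of the config read that raised KeyError."""
    with open(config_path) as f:
        read_config = json.load(f)
    # read_config["pad_token_id"] raises KeyError when the key is missing
    # from config.json; dict.get() supplies a fallback value instead.
    return read_config.get("pad_token_id", default)
```

Until something like this lands upstream, adding the key to `config.json` by hand, as suggested above, sidesteps the crash.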