Hey, @flatsiedatsie, could it be related to the prompt you're using?
I tried it and got a good response.
Here's the prompt I used:
<|im_start|>user
Explain quantum computing like I'm five<|im_end|>
<|im_start|>assistant
Could you try again with this one?
Console info, for reference:
@felladrin Thank you for having a look. I didn't have time to look into the details, but it seems Qwen models are quite sensitive to chat templates (due to their small size, there is no room for error).
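To illustrate that sensitivity, here's a sketch of the kind of template drift that could trip up a model this small (the exact failure mode is an assumption on my part, not something I verified):

// Well-formed ChatML: every turn is closed with <|im_end|>.
const goodPrompt =
  "<|im_start|>user\n" +
  "Explain quantum computing like I'm five<|im_end|>\n" +
  "<|im_start|>assistant\n";

// Malformed variant: the user turn is never closed. A larger model can
// often recover from this; a 0.5B model may emit garbage or nothing.
const badPrompt =
  "<|im_start|>user\n" +
  "Explain quantum computing like I'm five\n" +
  "<|im_start|>assistant\n";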
Please let me know if that works for you @flatsiedatsie
Thanks for testing on your end.
I managed to get output once, but only once.
I'm using the tokenizer from Transformers.js to generate the prompts. There was an issue with that, but as far as I can tell it was fixed a while ago. This process uses the Jinja2 chat templates stored on Hugging Face.
import { AutoTokenizer } from "@xenova/transformers";

const tokenizer = await AutoTokenizer.from_pretrained(config_url);
return tokenizer.apply_chat_template(messages, { tokenize: false, return_tensor: false, add_generation_prompt: true });
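For context, the input looks something like this (a minimal sketch; the message content is just the example from above):

// Messages array passed to apply_chat_template above.
const messages = [
  { role: "user", content: "Explain quantum computing like I'm five" },
];

// With add_generation_prompt: true, the Qwen chat template should produce
// the same ChatML string as in your example:
// <|im_start|>user
// Explain quantum computing like I'm five<|im_end|>
// <|im_start|>assistant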
Your hint about the sensitivity to the prompt is very useful though. I'm doing some tests now.
It's working now. I'm not even sure why :-D
I noticed something else interesting: this tiny model returns an empty string whenever I query it:
https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF/resolve/main/qwen1_5-0_5b-chat-q4_0.gguf
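For reference, this is roughly how I'm checking for the empty-string case (a sketch: the tokenizer repo is an assumption, and generate() is a placeholder for whatever inference call the runtime exposes, not a real API):

import { AutoTokenizer } from "@xenova/transformers";

// Chat tokenizer for the model above; the GGUF itself is loaded by the runtime.
const tokenizer = await AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B-Chat");
const prompt = tokenizer.apply_chat_template(
  [{ role: "user", content: "Explain quantum computing like I'm five" }],
  { tokenize: false, add_generation_prompt: true }
);

const output = await generate(prompt); // placeholder for the actual inference call
if (output.trim() === "") {
  console.warn("Model returned an empty string for prompt:", prompt);
}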