Closed: bil-ash closed this issue 4 months ago
cc. @CharlieFRuan
@CharlieFRuan Please take a look at this.
@bil-ash We will work on this. Meanwhile, please feel free to use MLC-LLM, which supports quantized versions of qwen2-0.5b, and connect WebLLM Chat to its serve API as a temporary workaround.
Instruction: https://github.com/mlc-ai/web-llm-chat/?tab=readme-ov-file#use-custom-models
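For reference, the serve API is OpenAI-compatible, so once MLC-LLM is serving the quantized model locally, any OpenAI-style client can reach the same endpoint that WebLLM Chat would use. A minimal sketch, assuming a local server on port 8000 and a hypothetical model id (check the MLC-LLM docs for the exact invocation and ids on your setup):

```typescript
// Minimal sketch, assuming MLC-LLM is serving a quantized Qwen2-0.5B build
// locally and listening on http://127.0.0.1:8000. The port, model id, and
// endpoint path are assumptions for illustration, not prescribed values.
async function queryLocalMlcServer(prompt: string): Promise<string> {
  const response = await fetch("http://127.0.0.1:8000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "Qwen2-0.5B-Instruct-q4f16_1-MLC", // assumed model id
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}

queryLocalMlcServer("Hello!").then(console.log);
```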
Created a PR to solve the issue. Please have a look.
The model is available on WebLLM Chat now. https://chat.webllm.ai/#/chat
Thanks for the contribution!
Problem Description
My Android phone has limited RAM, so it can only run the TinyLlama model. However, TinyLlama produces inferior results compared to Qwen2-0.5B Instruct (tested on desktop). Although Qwen2-0.5B has fewer parameters, I am unable to run it on my phone because web-llm-chat only offers the unquantized version of Qwen2-0.5B, while it does offer a quantized version of TinyLlama.
Solution Description
Please add the Qwen2-0.5B quantized versions (q4f16 and q4f32) to the list of supported models in web-llm-chat. Both are already available on Hugging Face.
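For illustration, the request amounts to adding entries along these lines to the model list; a rough sketch assuming web-llm's ModelRecord shape, with placeholder URLs and ids rather than the exact values that would eventually ship:

```typescript
// Hypothetical sketch of the requested additions: q4f16 and q4f32 builds of
// Qwen2-0.5B-Instruct. Field names follow web-llm's ModelRecord shape; the
// URLs, ids, and the model_lib placeholders are assumptions for illustration.
const qwen2QuantizedEntries = [
  {
    model: "https://huggingface.co/mlc-ai/Qwen2-0.5B-Instruct-q4f16_1-MLC",
    model_id: "Qwen2-0.5B-Instruct-q4f16_1-MLC",
    model_lib: "<URL to the matching WebGPU model library (.wasm)>",
    low_resource_required: true, // small quantized model, suitable for phones
  },
  {
    model: "https://huggingface.co/mlc-ai/Qwen2-0.5B-Instruct-q4f32_1-MLC",
    model_id: "Qwen2-0.5B-Instruct-q4f32_1-MLC",
    model_lib: "<URL to the matching WebGPU model library (.wasm)>",
    low_resource_required: true,
  },
];
```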
Alternatives Considered
No response
Additional Context
No response