camel-ai / camel

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org
https://docs.camel-ai.org/
Apache License 2.0

[Feature Request] Add `SmolLM` model and WebLLM #850

Open lightaime opened 2 months ago

lightaime commented 2 months ago

Required prerequisites

Motivation

Add SmolLM (https://huggingface.co/blog/smollm; browser demo: https://huggingface.co/spaces/HuggingFaceTB/SmolLM-360M-Instruct-WebGPU), along with WebGPU support via WebLLM:

https://webllm.mlc.ai/

Solution

No response

Alternatives

No response

Additional context

No response

Wendong-Fan commented 2 months ago

Hey @lightaime, this model is supported by Ollama; should we do a native integration? See: https://ollama.com/library/smollm

koch3092 commented 2 months ago

The difference between WebLLM and a conventional LLM web app:

Note: this comparison ignores the browser's dependency on the GPU.

[image: comparison of WebLLM vs. LLM web app]

koch3092 commented 2 months ago

Here are the key features of WebLLM:

  1. WebLLM leverages WebGPU on the user's local machine for hardware acceleration, enabling high-performance language model inference directly in the browser. This removes server dependencies, enhances privacy and personalization, and lowers operational costs.
  2. WebLLM natively supports a wide range of popular models, including Llama, Hermes, Phi, Gemma, RedPajama, Mistral, SmolLM, and Qwen, making it adaptable to a variety of tasks.
  3. It is fully compatible with the OpenAI API, offering features like JSON mode, function calling, and streaming output, which simplifies integration for developers (see the first sketch after this list).
  4. WebLLM allows the integration of custom models in MLC format to meet specific needs, enhancing flexibility in model deployment (see the custom-model sketch after this list).
  5. As a standalone package, WebLLM can be quickly integrated into projects via NPM, Yarn, or CDN, and comes with comprehensive examples and a modular design that makes it easy to connect with UI components.
  6. It supports streaming output and real-time interaction, making it suitable for applications like chatbots and virtual assistants.
  7. By delegating computational tasks to Web Workers or Service Workers, WebLLM keeps the UI responsive and efficiently manages model lifecycles (see the worker sketch after this list).
  8. It provides examples for building Chrome extensions, allowing users to extend browser functionalities with ease.
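To make points 1, 3, 5, and 6 concrete, here is a minimal sketch of streaming chat through WebLLM's OpenAI-compatible API, assuming the package is installed from NPM (`npm install @mlc-ai/web-llm`) and the page runs in a WebGPU-capable browser. The SmolLM model ID is taken from WebLLM's prebuilt model list and may differ between releases:

```typescript
// Minimal streaming chat with WebLLM, entirely in the browser.
// Assumes a WebGPU-capable browser; the model ID is one of the
// prebuilt SmolLM builds and may vary by web-llm release.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // The first call downloads and caches the weights, then compiles
  // WebGPU kernels on the local machine (no server round-trips).
  const engine = await CreateMLCEngine("SmolLM-360M-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // Identical request shape to the OpenAI Chat Completions API.
  const chunks = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
    stream: true,
  });

  // Consume tokens as they are generated.
  let reply = "";
  for await (const chunk of chunks) {
    reply += chunk.choices[0]?.delta?.content ?? "";
  }
  console.log(reply);
}

main();
```

Because the request/response shapes mirror OpenAI's, code written against an OpenAI-style client could in principle target a WebLLM engine with minimal changes.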
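And a sketch of the Web Worker setup from point 7, using web-llm's worker handler so inference runs off the UI thread. The file layout and ES-module bundler context are assumptions:

```typescript
// worker.ts — runs the engine off the UI thread.
import { WebWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

const handler = new WebWorkerMLCEngineHandler();
self.onmessage = (msg: MessageEvent) => handler.onmessage(msg);
```

```typescript
// main.ts — the returned engine exposes the same chat.completions
// interface as in the previous sketch, but requests are proxied
// to the worker so the UI thread stays responsive.
import { CreateWebWorkerMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateWebWorkerMLCEngine(
  new Worker(new URL("./worker.ts", import.meta.url), { type: "module" }),
  "SmolLM-360M-Instruct-q4f16_1-MLC",
);
```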
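Finally, for point 4, a hypothetical outline of registering a custom MLC-format model via an app config. The record field names have changed across web-llm releases, and the URLs below are placeholders, so treat this as a sketch to check against the release you pin:

```typescript
// Registering a custom MLC-format model alongside the prebuilt list.
// Field names and URLs are assumptions; verify against your web-llm version.
import { CreateMLCEngine, prebuiltAppConfig } from "@mlc-ai/web-llm";

const appConfig = {
  ...prebuiltAppConfig,
  model_list: [
    ...prebuiltAppConfig.model_list,
    {
      // URL to a repo holding MLC-converted weights (placeholder).
      model: "https://huggingface.co/my-org/MyModel-q4f16_1-MLC",
      model_id: "MyModel-q4f16_1-MLC",
      // URL to the compiled WebGPU model library (placeholder).
      model_lib: "https://example.com/MyModel-webgpu.wasm",
    },
  ],
};

const engine = await CreateMLCEngine("MyModel-q4f16_1-MLC", { appConfig });
```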