mlc-ai / web-llm

High-performance In-browser LLM Inference Engine
https://webllm.mlc.ai
Apache License 2.0

[Tracking] WebLLM: OpenAI-Compatible APIs in ChatModule #276

Closed (CharlieFRuan closed this 3 months ago)

CharlieFRuan commented 7 months ago

Overview

The goal of this task is to implement APIs that are compatible with the OpenAI API. Existing APIs like generate() will still be kept. Essentially we want JSON in and JSON out, resulting in usage like:

import * as webllm from "@mlc-ai/web-llm";

async function main() {
  const chat = new webllm.ChatModule();
  await chat.reload("Llama-2-7b-chat-hf-q4f32_1");

  const completion = await chat.chat_completion({
    messages: [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    // optional generative configs here
  });

  console.log(completion.choices[0]);
}

main();
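For reference, the object returned by chat_completion would mirror openai-node's ChatCompletion. Below is a minimal sketch of the expected shape, with field names assumed from OpenAI's API; WebLLM's exact coverage is what the "Existing gaps" section below tracks:

interface ChatCompletion {
  id: string;                      // unique completion id
  object: "chat.completion";
  created: number;                 // unix timestamp, in seconds
  model: string;                   // e.g. "Llama-2-7b-chat-hf-q4f32_1"
  choices: Array<{
    index: number;
    message: { role: "assistant"; content: string };
    finish_reason: "stop" | "length" | null;
  }>;
  usage?: {                        // token accounting, if reported
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}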

If streaming:

  const completion = await chat.chat_completion({
    messages: [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    stream: true,
    // optional generative configs here
  });

  for await (const chunk of completion) {
    console.log(chunk.choices[0].delta.content);
  }
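When stream is true, chat_completion can instead return an async iterable of chunks, which is what makes the for await loop above work. A hedged sketch of the chunk shape and a small consumer, assuming the chunk mirrors openai-node's ChatCompletionChunk (the collect helper is hypothetical, not part of WebLLM):

interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: Array<{
    index: number;
    delta: { role?: "assistant"; content?: string }; // incremental text
    finish_reason: "stop" | "length" | null;         // null until the final chunk
  }>;
}

// Hypothetical helper: accumulate streamed deltas into the full reply.
async function collect(stream: AsyncIterable<ChatCompletionChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.choices[0].delta.content ?? "";
  }
  return text;
}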

Action items

Existing gaps

There are some fields/features that WebLLM does not yet support compared to OpenAI's openai-node; a sketch of the request interface follows the list below.

Fields in ChatCompletionRequest

Fields in the ChatCompletion response

Others
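To make the gap analysis concrete, here is a hedged sketch of the request surface to cover. Field names are taken from openai-node; which subset WebLLM will actually accept is exactly what this issue tracks:

interface ChatCompletionRequest {
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
  stream?: boolean;             // default false
  // Generation config, mirroring OpenAI's names:
  temperature?: number;
  top_p?: number;
  max_tokens?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  stop?: string | string[];
  n?: number;                   // number of choices to generate
  seed?: number;
  // Function calling (in progress, per the comment below):
  tools?: Array<{
    type: "function";
    function: { name: string; description?: string; parameters?: object };
  }>;
}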

Future Items

Kartik14 commented 7 months ago

@CharlieFRuan Thanks for creating the tracking issue. Just wanted to let you know that @shreygupta2809 and I are currently working on supporting function calling.