mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Model Request] OpenChat 3.5 #1776

Open imoneoi opened 9 months ago

imoneoi commented 9 months ago

⚙️ Request New Models

Additional context

OpenChat 3.5 models are the best-performing 7B chat models on Chatbot Arena and HumanEval+. They share the same architecture as Mistral but use a different conversation template, shown below (no system message needed):

GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:

Starling-LM-7B-alpha and OpenChat-3.5-0106 share the same conversation template and architecture.
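
For reference, here is a minimal Python sketch (the function name is illustrative, not part of any MLC API) of how this template renders a list of turns into a prompt string:

def render_openchat_prompt(turns):
    """Render (role, content) pairs into the OpenChat 3.5 prompt format."""
    role_tags = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}
    eot = "<|end_of_turn|>"
    prompt = "".join(f"{role_tags[role]}: {content}{eot}" for role, content in turns)
    # A trailing assistant tag cues the model to generate the next reply.
    return prompt + "GPT4 Correct Assistant:"

print(render_openchat_prompt([
    ("user", "Hello"),
    ("assistant", "Hi"),
    ("user", "How are you today?"),
]))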

MikeLP commented 8 months ago

Actually, it already works well with the current flow; it just needs a few adjustments, since it is simply a fine-tuned Mistral architecture, which MLC already supports.

Here is my shell script for the conversion on an Intel Mac:

#!/bin/bash

name="openchat-3.5-0106"

model_dir="./models/${name}/" # path to the downloaded model weights; this is just an example
quantization="q4f16_1"
conv_template="gpt2" # placeholder; the conversation config is overridden at runtime below
output_dir="models/converted/${name}-${quantization}-MLC/"
device_metal="metal"
device_metal_x86_64="metal:x86-64"

output_lib_metal="${output_dir}/${name}-${quantization}-metal.so"
output_lib_metal_x86_64="${output_dir}/${name}-${quantization}-metal_x86_64.dylib"

# Convert weights
mlc_chat convert_weight ${model_dir} \
    --quantization ${quantization} \
    -o ${output_dir}

# Generate config
mlc_chat gen_config ${model_dir} \
    --quantization ${quantization} \
    --conv-template ${conv_template} \
    -o ${output_dir}

# Compile for M1 (or any other device)
mlc_chat compile ${output_dir}mlc-chat-config.json \
    --device ${device_metal} \
    -o ${output_lib_metal}

# Compile for Intel Mac
mlc_chat compile ${output_dir}mlc-chat-config.json \
    --device ${device_metal_x86_64} \
    -o ${output_lib_metal_x86_64}

Then you need to adjust the conversation config in the Python API:

# OpenChat conversation config
from mlc_chat import ChatModule, ChatConfig, ConvConfig

conv_config = ConvConfig(
    stop_str="<|end_of_turn|>",
    stop_tokens=[32000],  # token id of <|end_of_turn|>
    separator_style=0,
    seps=["<|end_of_turn|>"],
    system="Your system prompt",  # OpenChat also works without a system message
)

chat_config = ChatConfig(conv_config=conv_config)

cm = ChatModule(chat_config=chat_config, ...)
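
Putting the pieces together, a hedged usage sketch (assuming the mlc_chat Python API of that era; the paths reuse the variables from the shell script above):

from mlc_chat import ChatModule

cm = ChatModule(
    model="models/converted/openchat-3.5-0106-q4f16_1-MLC",  # output_dir from the script
    model_lib_path="models/converted/openchat-3.5-0106-q4f16_1-MLC/openchat-3.5-0106-q4f16_1-metal.so",
    chat_config=chat_config,  # the ChatConfig built above
)
print(cm.generate(prompt="Hello"))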

P.S.

So I believe we just need to update https://github.com/mlc-ai/mlc-llm/blob/main/cpp/conv_templates.cc to support one more template.