janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
https://jan.ai/
GNU Affero General Public License v3.0
23.24k stars 1.35k forks source link

feat: User can use Stop Words presets and add custom #3536

Open dan-homebrew opened 2 months ago

dan-homebrew commented 2 months ago

Problem

Solution

From technical aspect, there are 2 cases:

Format

Most of model introduce stop tokens with this format <{content}> . The content is different for each model arch. So I think we can predefine a list option of map model's arch : list stop words like this for user to choses:

Maybe 5 or 6 popular models arch is enough and another option to let users input whatever they want (this feature may be only for power user or dev because normal user might only use default configuration)

Design

Figma: https://www.figma.com/design/DYfpMhf8qiSReKvYooBgDV/Jan-App-(3rd-version)?node-id=8281-97234&t=qb7yU8r2PAayVdNW-4

Default Stop Words:

Predefined Options:

Offer a dropdown preset or quick-select for common stop words based on model architecture

Image

Custom/User-Added Stop Words:


Image

Task

dan-homebrew commented 2 months ago

@nguyenhoangthuan99 however there are some clarifications needed from Inference team. What is the stop token format?

<|special_token|>
<|end_of_text|>
<|eom_id|>
nguyenhoangthuan99 commented 2 months ago

Stop words of a model can be a list so I think we can make it like this image

From technical aspect, I think there are 2 cases we can follow:

Format Most of model introduce stop tokens with this format <{content}> . The content is different for each model arch. So I think we can predefine a list option of map model's arch : list stop words like this for user to choses:

Maybe 5 or 6 popular models arch is enough and another option to let users input whatever they want (this feature may be only for power user or dev because normal user might only use default configuration)