janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. It supports multiple engines (llama.cpp, TensorRT-LLM).
https://jan.ai/
GNU Affero General Public License v3.0

bug: wrong max context with Command-R #4108

Closed · Kep0a closed this issue 1 day ago

Kep0a commented 1 day ago

Jan version

0.5.9

Describe the Bug

Locally imported Command-R 08-2024 GGUF has an incorrect maximum context length of 8192.

Command-R 08-2024 has a maximum context of 128k.

Potential solution: all inference settings for locally imported models should be manually configurable.

Steps to Reproduce

  1. Import c4ai-command-r-08-2024-Q5_K_M.gguf locally via sym-link
  2. Observe that the context length cannot be extended past 8192

Screenshots / Logs

(Screenshot: 2024-11-23 at 4:33:02 PM)

What is your OS?

imtuyethan commented 1 day ago

@Kep0a For self-imported models, you'll need to configure the model settings yourself since Jan won't be able to automatically detect the optimal settings.

(Screenshot: 2024-11-24 at 7:49:53 PM)

How to manually configure model parameters

  1. After importing your model, go to your Jan Data Folder (you can open it from Settings > Advanced Settings > Jan Data Folder)
  2. Navigate to the models > imported folder
     (Screenshot: 2024-11-24 at 7:52:56 PM)
  3. Find your model's YAML/JSON file and adjust the settings there
  4. For context length specifically, look for max_tokens or ctx_len in the file
     (Screenshot: 2024-11-24 at 7:53:40 PM)

The file lists all the parameters you can tweak, with helpful comments about valid ranges (like temperature: 0-1). Just be careful when editing these values: make sure they match your model's actual capabilities!
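For illustration, the context-related entries in such a file might look roughly like the sketch below. Apart from max_tokens and ctx_len (the two fields mentioned above), the field names and values are assumptions about a typical imported-model config and may differ in your Jan version, so check your own file rather than copying this verbatim.

```yaml
# Illustrative sketch only; field names other than max_tokens and ctx_len
# are assumptions and may not match your Jan version's schema.

# Inference parameters
temperature: 0.7      # typical valid range: 0-1
top_p: 0.95
max_tokens: 4096      # maximum tokens generated per response

# Model load parameters
ctx_len: 131072       # raise from 8192 to Command-R 08-2024's 128k context limit
```

After saving the file, you may need to restart Jan (or reload the model) for the new value to take effect, and keep in mind that a 128k context needs considerably more RAM/VRAM than 8192.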

Tip: For better visibility, you can open the file with VS Code or any text editor that supports YAML syntax highlighting.

Kep0a commented 1 day ago

Thanks @imtuyethan !