Closed PieBru closed 4 months ago
Hi Piero!
We currently support setting the endpoint through the URL plus completions path:
```yaml
name: openai
api_key: ""
model: gpt-4-turbo-preview
max_tokens: 8192
role: You are a helpful assistant.
temperature: 1
top_p: 1
frequency_penalty: 0
presence_penalty: 0
thread: personal
omit_history: false
url: https://api.openai.com
completions_path: /v1/chat/completions
models_path: /v1/models
auth_header: Authorization
auth_token_prefix: 'Bearer '
```
These can be set either through `~/.chatgpt-cli/config.yaml` or through environment variables (e.g. `OPENAI_URL`).
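For the environment-variable route, a minimal sketch: only `OPENAI_URL` is confirmed above, and the second variable name is an assumption inferred from the `completions_path` config key.

```shell
# Point chatgpt-cli at a local OpenAI-compatible server.
# OPENAI_URL is mentioned above; OPENAI_COMPLETIONS_PATH is an assumed
# name derived from the completions_path config key.
export OPENAI_URL="http://localhost:11434"
export OPENAI_COMPLETIONS_PATH="/v1/chat/completions"
```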
Please let me know if it works with FOSS servers for you! I'm curious.
Good news: it works on Arch Linux with ollama-cuda and these very basic parameters in `~/.chatgpt-cli/config.yaml`:
```yaml
name: ollama
api_key: "sk-..."
model: mistral
max_tokens: 2048
role: You are a helpful assistant.
temperature: 1
top_p: 1
frequency_penalty: 0
presence_penalty: 0
thread: personal
omit_history: false
url: "http://localhost:11434"
completions_path: /v1/chat/completions
models_path: /v1/models
auth_header: Authorization
auth_token_prefix: 'Bearer '
```
I cautiously lowered `max_tokens`; there are open-source models supporting 128K tokens and more.
BTW, ollama now also serves an OpenAI-compatible API endpoint and can self-host any GGUF model in RAM+GPU, so there are plenty of choices out there.
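As a quick sanity check of that endpoint, something like this should return a completion (assuming ollama is running on its default port and the mistral model has been pulled):

```shell
# Hit ollama's OpenAI-compatible chat completions route directly.
# Requires a running local ollama instance, so this is a live check,
# not something that runs offline.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral", "messages": [{"role": "user", "content": "Hello"}]}'
```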
Thank you, Piero
That's great! Thanks for circling back. Happy to hear that it's working. I will look into ollama. Cheers.
Looking at the progression of their GitHub stars, FOSS servers are growing fast. Supporting them will broaden the usage of this tool. Thank you, Piero