dfadev opened 3 weeks ago
Sounds like a matter of conditional renaming in https://github.com/madox2/vim-ai/blob/758be522e6d765eeb78ce7681f4b39e3b05043b8/py/utils.py#L53. One could also simply pass through the options declared in Vim, which would be more flexible with respect to API changes across the many models.
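e.g. a rough sketch of the renaming idea (hypothetical helper, untested; the `o1` prefix check is just one way to detect models that reject `max_tokens`):

```python
def adapt_options_for_model(opts):
    # Hypothetical: o1 models reject 'max_tokens' and expect
    # 'max_completion_tokens', so rename the key conditionally.
    if opts.get('model', '').startswith('o1') and 'max_tokens' in opts:
        opts['max_completion_tokens'] = opts.pop('max_tokens')
    return opts
```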
Yeah, it's the `stream: false` part that is tripping me up. It's not as simple as just changing the option; there appear to be logic changes needed as well, because in non-stream mode the whole completion comes back in a single response instead of being read incrementally in chunks.
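For context, the non-streaming path is roughly just this (a stdlib-only sketch, not the plugin's actual code):

```python
import json
import urllib.request

def request_completion_non_streaming(payload, api_key):
    # With "stream": false the API returns one JSON body; there is no
    # chunk loop -- read once and pull the message out of choices[0].
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```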
I have just prototyped non-streaming support on the branch `support-non-streaming`, but I didn't have time to test it properly. Let me know if that works for you.
nice! I had to do this:
```diff
diff --git a/py/utils.py b/py/utils.py
index 381ba75..4869398 100644
--- a/py/utils.py
+++ b/py/utils.py
@@ -51,12 +51,17 @@ def normalize_config(config):
 
 def make_openai_options(options):
     max_tokens = int(options['max_tokens'])
-    return {
+    max_completion_tokens = int(options['max_completion_tokens'])
+    result = {
         'model': options['model'],
-        'max_tokens': max_tokens if max_tokens > 0 else None,
         'temperature': float(options['temperature']),
         'stream': int(options['stream']) == 1,
     }
+    if max_tokens > 0:
+        result['max_tokens'] = max_tokens
+    if max_completion_tokens > 0:
+        result['max_completion_tokens'] = max_completion_tokens
+    return result
 
 def make_http_options(options):
     return {
```
because it didn't want `max_tokens` any more; instead o1 requires `max_completion_tokens`. It complains even with `null` for `max_tokens`.
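With that patch, setting `max_tokens` to `0` keeps it out of the request body entirely (a quick sanity check of the payload, using the patched `make_openai_options` above):

```python
options = {
    'model': 'o1-mini',
    'max_tokens': '0',
    'max_completion_tokens': '25000',
    'temperature': '1',
    'stream': '0',
}
print(make_openai_options(options))
# {'model': 'o1-mini', 'temperature': 1.0, 'stream': False,
#  'max_completion_tokens': 25000}
```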
Also needed to set the initial prompt to `>>> user` instead of `>>> system`, and set `temperature` to `1` in `init.vim`:
```vim
let initial_prompt =<< trim END
>>> user
You are a completion engine with following parameters:
Task: Provide compact code/text completion, generation, transformation or explanation
Topic: general programming and text editing
Style: Plain result without any commentary, unless commentary is necessary. Don't use semicolons for javascript.
Audience: Users of text editor and programmers that need to transform/generate text
END

let chat_engine_config = {
  \ "engine": "chat",
  \ "options": {
  \   "stream": 0,
  \   "model": "o1-mini",
  \   "max_tokens": 0,
  \   "max_completion_tokens": 25000,
  \   "temperature": 1,
  \   "request_timeout": 120,
  \   "selection_boundary": "",
  \   "initial_prompt": initial_prompt,
  \ },
  \ "ui": {
  \   "open_chat_command": "preset_below",
  \   "scratch_buffer_keep_open": 0,
  \   "populate_options": 0,
  \   "code_syntax_enabled": 1,
  \   "paste_mode": 1,
  \   "show_initial_prompt": 1,
  \ },
  \}
```
sample output:

```
>>> user
how many R's in strawberry?

<<< assistant
There are three **R**'s in "strawberry".
```
sample `o1-mini` response:

```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "There are three **R**'s in \"strawberry\".",
        "refusal": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1728788354,
  "id": "chatcmpl-AHj74vV1jBJoNRMWk6dSOaBHl0jlT",
  "model": "o1-mini-2024-09-12",
  "object": "chat.completion",
  "system_fingerprint": "fp_692002f015",
  "usage": {
    "completion_tokens": 601,
    "completion_tokens_details": {
      "reasoning_tokens": 576
    },
    "prompt_tokens": 98,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "total_tokens": 699
  }
}
```
`o1-preview` also works:

```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "3",
        "refusal": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1728788678,
  "id": "chatcmpl-AHjCIeW5SzrPYWYV2nuVxzVVLuIgU",
  "model": "o1-preview-2024-09-12",
  "object": "chat.completion",
  "system_fingerprint": "fp_49f580698f",
  "usage": {
    "completion_tokens": 1355,
    "completion_tokens_details": {
      "reasoning_tokens": 1344
    },
    "prompt_tokens": 102,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "total_tokens": 1457
  }
}
```
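Note how little of the completion is visible text: 576 of 601 completion tokens for `o1-mini`, and 1344 of 1355 for `o1-preview`, are hidden reasoning tokens. A small sketch of pulling that accounting out of a parsed response (hypothetical helper over the JSON shape shown above):

```python
def summarize_response(body):
    # 'body' is a parsed chat.completion object like the samples above
    text = body["choices"][0]["message"]["content"]
    usage = body["usage"]
    reasoning = usage["completion_tokens_details"]["reasoning_tokens"]
    visible = usage["completion_tokens"] - reasoning
    return text, reasoning, visible

# For the o1-preview sample: text == "3", reasoning == 1344, visible == 11
```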
some notes for you re: the o1 models and vim-ai:

- `temperature` must be `1`
- the `system` role is not accepted
- `max_tokens` must be left out entirely (even `null` is rejected); use `max_completion_tokens` instead
- `stream: true` is not supported

First three I can bypass, but non-stream mode is a stopper.
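Putting those constraints together, the request payload that worked for me looks roughly like this (values taken from the config above; it could be fed to the `request_completion_non_streaming` sketch earlier in the thread):

```python
payload = {
    "model": "o1-mini",
    "temperature": 1,
    "stream": False,
    "max_completion_tokens": 25000,
    "messages": [
        # note: "user" role, since o1 rejects "system"
        {"role": "user", "content": "how many R's in strawberry?"},
    ],
}
```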
I think OpenAI will add this support later, so we could just wait for them.