twinnydotdev / twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.
https://twinny.dev
MIT License

Enhance Twinny with LiteLLM (and indirectly OpenRouter) Support #190

Closed bvelker closed 6 months ago

bvelker commented 6 months ago

Feature Request: Integrate LiteLLM with the Twinny Visual Studio Code plugin to enrich AI code completion capabilities. This integration aims to leverage the wide variety of models and APIs available through LiteLLM, enhancing the plugin's functionality and user experience.

Current Issue: Attempting to use LiteLLM Proxy results in an APIConnectionError caused by an unexpected 'prompt' keyword argument, indicating a mismatch between the input LiteLLM expects and the request data Twinny sends (see the sketch below).

Solution: Modify Twinny to accommodate the input structure required by LiteLLM Proxy, specifically by addressing how the 'prompt' key is handled.

Objective: The primary goal is to augment Twinny’s AI code completion service by enabling efficient access to an expanded set of models.

Action: Request implementation of this compatibility feature and encourage contributions towards integrating LiteLLM (and indirectly OpenRouter) support into Twinny. This initiative could significantly amplify the plugin's utility and adoption.
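For illustration, a minimal sketch of the mismatch described above; the request bodies are assumptions based on the error message, not Twinny's actual code:

```typescript
// Completion-style body with a top-level "prompt" key. Forwarding this to
// LiteLLM's chat handler is what surfaces the unexpected 'prompt' keyword
// argument mentioned above.
const completionStyleBody = {
  model: "gpt-3.5-turbo",
  prompt: "def fibonacci(n):",
  stream: true,
};

// OpenAI chat-style body that LiteLLM Proxy's /chat/completions expects:
// a "messages" array instead of a "prompt" string.
const chatStyleBody = {
  model: "gpt-3.5-turbo",
  stream: true,
  messages: [{ role: "user", content: "def fibonacci(n):" }],
};
```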

rjmacarthy commented 6 months ago

Hey, thanks for the report. I added another provider, "other", and removed prompt from the payload. Please report back.

Many thanks,

bvelker commented 6 months ago

It's closer. We appear to have another request formatting issue for follow-up chat responses:

```
12:58:53 - LiteLLM Router:INFO: router.py:472 - litellm.acompletion(model=openrouter/openai/gpt-3.5-turbo-0125) 200 OK
12:58:53 - LiteLLM Router:DEBUG: router.py:1144 - Async Response: <litellm.utils.CustomStreamWrapper object at 0x119e19150>
INFO: 127.0.0.1:51575 - "POST /chat/completions HTTP/1.1" 200 OK
12:58:53 - LiteLLM Proxy:DEBUG: proxy_server.py:2592 - inside generator
{'error': {'message': '{\n "error": {\n "message": "Additional properties are not allowed (\'type\' was unexpected) - \'messages.1\'",\n "type": "invalid_request_error",\n "param": null,\n "code": null\n }\n}\n', 'code': 400}}
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 2595, in async_data_generator
    async for chunk in response:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/litellm/utils.py", line 9816, in __anext__
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/litellm/utils.py", line 9700, in __anext__
    async for chunk in self.completion_stream:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/openai/_streaming.py", line 117, in __aiter__
    async for item in self._iterator:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/openai/_streaming.py", line 138, in __stream__
    raise APIError(
openai.APIError: An error occurred during streaming
```

The first chat response always works fine; this is only a problem with follow-up chats.
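For context, the 400 returned upstream says that messages.1 (the second message in the messages array) carries a type property, which the OpenAI chat schema rejects. A minimal sketch of the two payload shapes, with field values that are purely illustrative and not taken from Twinny's source:

```typescript
// Rejected shape: the second message ("messages.1") carries an extra
// "type" property, which the OpenAI chat schema does not allow.
const rejectedFollowUp = {
  model: "gpt-3.5-turbo",
  stream: true,
  messages: [
    { role: "user", content: "Explain this function." },
    { role: "assistant", content: "It reads the config file.", type: "chat" }, // hypothetical value
  ],
};

// Accepted shape: messages contain only the fields the schema defines
// (role, content, and the other documented optional fields).
const acceptedFollowUp = {
  model: "gpt-3.5-turbo",
  stream: true,
  messages: [
    { role: "user", content: "Explain this function." },
    { role: "assistant", content: "It reads the config file." },
  ],
};
```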

To replicate:

```
litellm --port 12121 --detailed_debug --debug -c ./proxy_server_config.yaml
```

proxy_server_config.yaml:

```yaml
model_list:

litellm_settings:
  drop_params: True
  max_budget: 100
  budget_duration: 30d
  num_retries: 0
  request_timeout: 600
  telemetry: False

general_settings:
  proxy_budget_rescheduler_min_time: 60
  proxy_budget_rescheduler_max_time: 64
  proxy_batch_write_at: 1
```
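The model_list contents were not captured above. For reference only, a hypothetical entry that maps the gpt-3.5-turbo name used in the Twinny settings to the OpenRouter model seen in the router logs could look like this (key layout follows LiteLLM's config format; the values are examples, not the reporter's actual config):

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: openrouter/openai/gpt-3.5-turbo-0125
      api_key: os.environ/OPENROUTER_API_KEY
```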

VS Code twinny settings:

```json
"twinny.apiProvider": "other",
"twinny.chatApiPath": "/chat/completions",
"twinny.chatApiPort": 12121,
"twinny.chatModelName": "gpt-3.5-turbo"
```

Should we update the Twinny: Chat Model Name description text ("Model identifier for chat completions. Applicable only for Ollama and Oobabooga API.")?

For Twinny: Fim Template Format, add "other" as an option. Actually, I suggest using "LiteLLM" instead, as that will guide users better.

LiteLLM is a prompt-structure translation layer that exposes arbitrary API services through the OpenAI prompting format, so in theory, if Twinny can structure LLM calls in OpenAI format, it gets every LiteLLM-supported API for free.
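To make that concrete, a hedged sketch of an OpenAI-format chat call against the local LiteLLM proxy configured earlier in this thread (the port and path mirror the settings above; everything else is illustrative):

```typescript
// Hypothetical sketch: an OpenAI-format chat completion sent to the local
// LiteLLM proxy (port 12121 and /chat/completions from the settings above).
async function chatThroughLiteLLM(): Promise<void> {
  const response = await fetch("http://localhost:12121/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gpt-3.5-turbo", // resolved by the proxy's model_list
      stream: false,          // non-streaming keeps the sketch short
      messages: [
        { role: "system", content: "You are a coding assistant." },
        { role: "user", content: "Write a binary search in TypeScript." },
      ],
    }),
  });
  const data = await response.json();
  console.log(data.choices[0].message.content);
}
```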

rjmacarthy commented 6 months ago

I think I fixed it in 3.9.3, please check. By the way, please give details of how to make a FIM request through LiteLLM when using OpenAI and Ollama codellama. Thanks.

Edit: Hey, I added support for LiteLLM in 3.10.0, please check and report back. I am still unsure about the FIM template with LiteLLM.

Thanks.
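For the open FIM question above, one possible shape of such a request, offered only as a sketch: it assumes the proxy exposes an OpenAI-style /completions endpoint and that the configured model is a codellama variant using the <PRE>/<SUF>/<MID> infill template, neither of which is confirmed in this thread.

```typescript
// Hypothetical FIM request through a LiteLLM proxy. The model name and the
// codellama infill markers are assumptions for illustration.
async function fimThroughLiteLLM(prefix: string, suffix: string): Promise<string> {
  const response = await fetch("http://localhost:12121/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama",                             // example model_list alias
      prompt: `<PRE> ${prefix} <SUF>${suffix} <MID>`, // codellama infill format
      max_tokens: 128,
      stop: ["<EOT>"],                                // codellama end-of-infill marker
    }),
  });
  const data = await response.json();
  return data.choices[0].text;
}
```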

rjmacarthy commented 6 months ago

LiteLLM is now supported.