Add support for other LLM services

yaroslavyaroslav commented 9 months ago

There's a few competitive services released their API just yet.

yigitkonur commented 9 months ago

An IDE is needed for Prompt Engineers. Or someone like you will take the initiative, enhance it with a Sublime Text Plugin, and bring this feature to Sublime Text, which would be truly wonderful for all of us. I hope you never lose your motivation for the project, I will become a sponsor as soon as possible - I sincerely thank you for your contribution.

ishaan-jaff commented 8 months ago

Hi @yaroslavyaroslav @yigitkonur - I believe we can make this easier I’m the maintainer of LiteLLM - we allow you to deploy a LLM proxy to call 100+ LLMs in 1 format - PaLM, Bedrock, OpenAI, Anthropic etc https://github.com/BerriAI/litellm/tree/main/openai-proxy.

If this looks useful (we're used in production)- please let me know how we can help.

Usage

PaLM request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "palm/chat-bison",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

gpt-3.5-turbo request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

claude-2 request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "claude-2",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

yaroslavyaroslav commented 8 months ago

@ishaan-jaff wow, thanks for highlighting this. Indeed this could be a simpler solution, than implementing all of those by my own. Even though there're some caveats on Sublime Text side we have here, e.g. there's a hardly limited list of dependency we can rely on within the plugin code. Though I saw such plugins that was rely on complete third party solution, like some node.js running code, so this should be solvable.

Just a few question I have here, that I hope would save some time on both sides:

Is it complete cross platform in its un-contained run, or there are pits and falls to overcome to make it working both windows, linux and macOS?
Do I get it right that this is nothing but a local server that manages all the network by its own based on request's content it receive, i.e. model field? I dropped a quick look into the docs, just want to ensure this moment specifically.

ishaan-jaff commented 8 months ago

yes the proxy is cross platform
yes it's a proxy server that allows you to call all LLMs in one format. You can choose to deploy it or use it locally

yaroslavyaroslav commented 8 months ago

@ishaan-jaff Thanks, I'll consider it in depth when I'll come closer to implementing this one.

james2doyle commented 8 months ago

It would be cool if this plugin supported Ollama. You can run it locally as a standalone server, and make API calls to it: https://github.com/jmorganca/ollama/blob/main/docs/api.md

ishaan-jaff commented 8 months ago

Hi @james2doyle litellm already supports ollama

james2doyle commented 8 months ago

@ishaan-jaff Oh nice. I misunderstood what litellm was, I thought it was a hosted service

ishaan-jaff commented 8 months ago

no worries - litellm is a python package to call 100+ LLMs in the same I/O format. We also offer a proxy server if you don't want to make code changes to your app

yaroslavyaroslav commented 8 months ago

FYI: for now there's no option to use arbitrary dependency within the ST plug-in, but TIL that package control 4.0 beta allows just this, regarding its current state I believe it would be released within a quarter of something.

So all work in regards of this task will be started right after that release, as well as some other missed features, like precise tokens count.

UPD: I found PC 4.0 beta project, so I believe that it's quite far from being released in next quarter.

yaroslavyaroslav commented 7 months ago

The good news is that PC 4.0.0 released has been just yet, which means soon it'll be possible to add support for a custom python libraries into the package. We're not there yet, coz packagecontrol.io itself still doesn't fully supports 4.0.0 scheme (e.g. arbitrary libraries as dependencies), but I believe that it would take about a month or two to make it happen.

rubjo commented 4 months ago

Any news on this? Per now using https://github.com/icebaker/nano-bots-api via https://github.com/icebaker/sublime-nano-bots to talk to Ollama / Mistral locally. Which works, but would like to see if something using this could be better.

yaroslavyaroslav commented 4 months ago

Nope unfortunately, every time I tried any local llm as an assistant for my language of interest I noticed a way below gpt4 suggestions quality. More than that the same picture I'm observing with the all competing services as perplexity.

So honestly I have no plans to implement this until things changes.

Peeking at the very last bard/geminy 1m context window though. Maybe it'll worth it.

Aiq0 commented 4 months ago

Any news on this? Per now using https://github.com/icebaker/nano-bots-api via https://github.com/icebaker/sublime-nano-bots to talk to Ollama / Mistral locally. Which works, but would like to see if something using this could be better.

I was able to make Ollama working by some small changes, as Ollama API is compatible with OpenAI API:

instal package manually, so to make changes to it
change request in openai_network_client.py, to use ~~self.connection = HTTPClient('localhost:11434')~~ self.connection = HTTPConnection('localhost:11434') instead of logic present there
added dummy token "ollama-dummy-longer-than-10-characters" (or remove token checking)

change models in assitants:

"assistants": [
{
    "assistant_role": "Apply the change requested by the user to the code with respect to senior knowledge of programming",
    "chat_model": "codellama",
    "max_tokens": 4000,
    "name": "Replace",
    "prompt_mode": "replace"
},
{
    "assistant_role": "Insert code or whatever user will request with the following command instead of placeholder with respect to senior knowledge of programming",
    "chat_model": "codellama",
    "max_tokens": 4000,
    "name": "Insert",
    "prompt_mode": "insert",
    "placeholder": "## placeholder"
},
{
    "assistant_role": "Append code or whatever user will request with the following command instead of placeholder with respect to senior knowledge of programming",
    "chat_model": "codellama",
    "max_tokens": 4000,
    "name": "Append",
    "prompt_mode": "append"
},
]

So it will work nicely, if there would be config options to:

toggle between HTTP and HTTPS
change URL
do not require token, when url is not api.openai.com (or just advice to add some dummy token)

rubjo commented 4 months ago

@Aiq0 Confirmed working, thank you! (Used HTTPConnection, not HTTPClient)

Aiq0 commented 4 months ago

@Aiq0 Confirmed working, thank you! (Used HTTPConnection, not HTTPClient)

You are welcome. (sorry, that was typo)

yaroslavyaroslav commented 4 months ago

@rubjo @Aiq0 glad to hear it folks!

It would be just awesome if you'd do an extra mile and perform a PR with such functionality. The network layer code is kinda the least confusing piece through the whole code base, so I believe it could be taken without paying too much effort for that.

Aiq0 commented 4 months ago

@rubjo @Aiq0 glad to hear it folks!

It would be just awesome if you'd do an extra mile and perform a PR with such functionality. The network layer code is kinda the least confusing piece through the whole code base, so I believe it could be taken without paying too much effort for that.

OK, I am going to add some config settings for tweaking connection and create PR (most likely tomorrow). Is there anything other that should be considered?

yaroslavyaroslav commented 4 months ago

I believe not much. Just please try to avoid to overcomplicating things. Like to not present an additional settings if they can be avoided (e.g. I believe that it's perfectly fine to come along with dummy token on the user side for a local models rather than providing a separate toggle for that).

If you're about to add some global settings options, please consider to do so on the first level if it's possible.

A few words dropped in Readme about this new feature would definitely worth it either.

yaroslavyaroslav / OpenAI-sublime-text

Add support for other LLM services #29

Usage