Hi! Let me begin by thanking you for this awesome project!
I'll test this more soon (also on WSL), but it seems to work on OS X!
```shell
% sgpt --no-functions --model ollama/mixtral:latest "Hi, Who are you?"
```

> Hello! I'm ShellGPT, your programming and system administration assistant. I'm here to help you with any questions or tasks related to the Darwin/MacOS 14.2 operating system and the zsh shell. I aim to provide short and concise responses in about 100 words, using Markdown formatting when appropriate. If needed, I can store data from our conversation for future reference. How can I assist you today?
For now, two little remarks.

Straight from source:

```shell
python app.py --model ollama/mistral:7b-instruct "Who are you?"
```

Just install it outside of the venv:

```shell
# Clone the repository.
git clone https://github.com/TheR1D/shell_gpt.git
cd shell_gpt
# Switch to the ollama branch.
git checkout ollama
# Install.
pip install --upgrade .
```
However, running it without `--no-functions` fails:

```text
shell_gpt % sgpt --model ollama/mixtral:latest "Hi, Who are you?"
<...>
/opt/homebrew/anaconda3/lib/python3.11/site-packages/litellm/llms/ollama.py:246 in │
│ ollama_completion_stream │
│ │
│ 243 │ │ │ for transformed_chunk in streamwrapper: │
│ 244 │ │ │ │ yield transformed_chunk │
│ 245 │ │ except Exception as e: │
│ ❱ 246 │ │ │ raise e │
│ 247 │
│ 248 │
│ 249 async def ollama_async_streaming(url, data, model_response, encoding, logging_obj): │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ data = { │ │
│ │ │ 'model': 'mixtral:latest', │ │
│ │ │ 'prompt': 'You are ShellGPT\nYou are programming and system administration │ │
│ │ assistant.\nYou ar'+1622, │ │
│ │ │ 'options': {'format': 'json', 'temperature': 0.0, 'top_p': 1.0} │ │
│ │ } │ │
│ │ logging_obj = <litellm.utils.Logging object at 0x105f73e50> │ │
│ │ response = <Response [400 Bad Request]> │ │
│ │ url = 'http://localhost:11434/api/generate' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /opt/homebrew/anaconda3/lib/python3.11/site-packages/litellm/llms/ollama.py:234 in │
│ ollama_completion_stream │
│ │
│ 231 │ │ try: │
│ 232 │ │ │ if response.status_code != 200: │
│ 233 │ │ │ │ raise OllamaError( │
│ ❱ 234 │ │ │ │ │ status_code=response.status_code, message=response.text │
│ 235 │ │ │ │ ) │
│ 236 │ │ │ │
│ 237 │ │ │ streamwrapper = litellm.CustomStreamWrapper( │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ data = { │ │
│ │ │ 'model': 'mixtral:latest', │ │
│ │ │ 'prompt': 'You are ShellGPT\nYou are programming and system administration │ │
│ │ assistant.\nYou ar'+1622, │ │
│ │ │ 'options': {'format': 'json', 'temperature': 0.0, 'top_p': 1.0} │ │
│ │ } │ │
│ │ logging_obj = <litellm.utils.Logging object at 0x105f73e50> │ │
│ │ response = <Response [400 Bad Request]> │ │
│ │ url = 'http://localhost:11434/api/generate' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /opt/homebrew/anaconda3/lib/python3.11/site-packages/httpx/_models.py:574 in text │
│ │
│ 571 │ @property │
│ 572 │ def text(self) -> str: │
│ 573 │ │ if not hasattr(self, "_text"): │
│ ❱ 574 │ │ │ content = self.content │
│ 575 │ │ │ if not content: │
│ 576 │ │ │ │ self._text = "" │
│ 577 │ │ │ else: │
│ │
│ ╭────────────── locals ───────────────╮ │
│ │ self = <Response [400 Bad Request]> │ │
│ ╰─────────────────────────────────────╯ │
│ │
│ /opt/homebrew/anaconda3/lib/python3.11/site-packages/httpx/_models.py:568 in content │
│ │
│ 565 │ @property │
│ 566 │ def content(self) -> bytes: │
│ 567 │ │ if not hasattr(self, "_content"): │
│ ❱ 568 │ │ │ raise ResponseNotRead() │
│ 569 │ │ return self._content │
│ 570 │ │
│ 571 │ @property │
│ │
│ ╭────────────── locals ───────────────╮ │
│ │ self = <Response [400 Bad Request]> │ │
│ ╰─────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ResponseNotRead: Attempted to access streaming response content, without having called `read()`.
```
It's asking for an API key even when I press Enter while I'm trying to set up Ollama.
I tested this PR with the MistralAI API, and it fixes the issue I was having before where it complained about functions. It seems that Mistral doesn't support them.
Thanks for your work! This is a great project.
# Integrating multiple locally hosted LLMs using LiteLLM
## Test It

To test ShellGPT with Ollama, follow these steps:
### Ollama

#### Installation

**MacOS**

Download and launch the Ollama app.
**Linux & WSL2**
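The install command for Linux and WSL2 did not survive extraction here; a sketch of the usual one-liner from the Ollama docs (an assumption — verify against the current docs before piping a script into your shell):

```shell
# Download and run the Ollama install script.
curl https://ollama.ai/install.sh | sh
```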
#### Setup
We can have multiple large language models installed in Ollama, such as Llama 2, Mistral, and others. It is recommended to use `mistral:7b-instruct` for the best results. Pull the model (this will take some time to download and install it), then start the API server, as shown below.
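A minimal sketch of those two steps using the standard Ollama CLI commands `ollama pull` and `ollama serve`; the default server address is `http://localhost:11434`, which matches the URL in the traceback above:

```shell
# Download the recommended model (several GB, so this takes a while).
ollama pull mistral:7b-instruct

# Start the Ollama API server on http://localhost:11434.
ollama serve
```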
### ShellGPT configuration
Now that we have the Ollama backend running, we need to configure ShellGPT to use it. First, check that the Ollama backend is running and accessible:
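A quick check, assuming Ollama's default port 11434; the root endpoint should reply with a short "Ollama is running" message:

```shell
curl http://localhost:11434
# Expected output: Ollama is running
```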
If you are running ShellGPT for the first time, you will be prompted for an OpenAI API key. Just press `Enter` to skip this step.

Now we need to change a few settings in `~/.config/shell_gpt/.sgptrc`. Open the file in your editor, change `DEFAULT_MODEL` to `ollama/mistral:7b-instruct`, and make sure that `OPENAI_USE_FUNCTIONS` is set to `false`.
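After those edits, the relevant lines of `~/.config/shell_gpt/.sgptrc` should look something like this (other keys omitted; the file uses simple `KEY=value` pairs):

```text
DEFAULT_MODEL=ollama/mistral:7b-instruct
OPENAI_USE_FUNCTIONS=false
```

With that in place, invocations like `sgpt --model ollama/mistral:7b-instruct "Who are you?"` from the comments above should be routed to the local Ollama server.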
And that's it! Now you can use ShellGPT with the Ollama backend.

### Azure