OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0
52.22k stars · 4.61k forks

Ollama is a better LLM server for local #856

Closed · iplayfast closed this issue 8 months ago

iplayfast commented 8 months ago

Is your feature request related to a problem? Please describe.

I'm already using Ollama for many things; running LM Studio for this seems wrong, since it only runs as an AppImage.

Describe the solution you'd like

Directly support Ollama at 127.0.0.1:11434 (e.g. from langchain import ollama).

Describe alternatives you've considered

LM Studio is OK, but Ollama is better.

Additional context

No response
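
For reference, Ollama exposes a plain HTTP API on that port, so "directly support Ollama" essentially means pointing the client at it. A minimal sketch of a direct call (assumes Ollama is running on the default port and that codellama:7b has already been pulled; the prompt is just an example):

# Minimal sketch: talk to the local Ollama HTTP API directly.
# Assumes Ollama is listening on its default port and codellama:7b is pulled.
import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={"model": "codellama:7b", "prompt": "Say hello in one word.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])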

Notnaton commented 8 months ago

You can use Ollama with:

interpreter --api_base http://localhost:port/v1
mak448a commented 8 months ago

You can use Ollama with:

interpreter --api_base http://localhost:port/v1

I tried interpreter --api_base http://localhost:11434/v1 --vision --local but it didn't work. Any ideas?

Exception: Error occurred. OpenAIException - 404 page not found
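
A quick way to rule out a connection problem before blaming the flags: a running Ollama server answers a plain GET on its root URL with a short status string. A minimal sanity check (adjust the port if yours differs):

import requests

# Sanity check: a running Ollama server replies to GET / with "Ollama is running".
r = requests.get("http://localhost:11434/", timeout=5)
print(r.status_code, r.text)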

hidek84 commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b
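
The same setup should also work from the Python API rather than the CLI. A rough sketch against the 0.2.x interface (the llm attribute names here follow that version's docs, so treat them as assumptions):

from interpreter import interpreter

# Sketch: configure Open Interpreter (0.2.x-style Python API) for a local Ollama model.
# The ollama/ prefix tells LiteLLM to route the request to the local Ollama server.
interpreter.llm.model = "ollama/codellama:7b"
interpreter.chat("Print 'hello' from Python")
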
richstav commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b

Perfect timing! Same issue as OP, this solved it.

mak448a commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b

Thanks! I remembered something like this but I forgot what it was exactly.

Kreijstal commented 8 months ago

Does not run on Windows, pretty useless.

adriens commented 8 months ago

I could not make it run on my Linux box:

 interpreter --model ollama/codellama:7b

▌ Model set to ollama/codellama:7b                                            

Open Interpreter will require approval before running code.                     

Use interpreter -y to bypass this.                                              

Press CTRL-C to exit.                                                           

> hello

We were unable to determine the context window of this model. Defaulting to     
3000.                                                                           

If your model can handle more, run interpreter --context_window {token limit}   
--max_tokens {max tokens per response}.                                         

Continuing...                                                                   

        Python Version: 3.11.7
        Pip Version: 23.3.1
        Open-interpreter Version: cmd:Interpreter, pkg: 0.2.0
        OS Version and Architecture: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
        CPU Info: x86_64
        RAM Info: 7.57 GB, used: 2.12, free: 0.78

        # Interpreter Info

        Vision: False
        Model: ollama/codellama:7b
        Function calling: None
        Context window: None
        Max tokens: None

        Auto run: False
        API base: None
        Offline: False

        Curl output: Not local

        # Messages

        System Message: You are Open Interpreter, a world-class programmer that can complete any goal by executing code.
First, write a plan. **Always recap the plan between each code block** (you have extreme short-term memory loss, so you need to recap the plan between each message block to retain it).
When you execute code, it will be executed **on the user's machine**. The user has given you **full and complete permission** to execute any code necessary to complete the task. Execute the code.
If you want to send data between programming languages, save the data to a txt or json.
You can access the internet. Run **any code** to achieve the goal, and if at first you don't succeed, try again and again.
You can install new packages.
When a user refers to a filename, they're likely referring to an existing file in the directory you're currently executing code in.
Write messages to the user in Markdown.
In general, try to **make plans** with as few steps as possible. As for actually executing code to carry out that plan, for *stateful* languages (like python, javascript, shell, but NOT for html which starts from 0 every time) **it's critical not to try to do everything in one code block.** You should try something, print information about it, then continue from there in tiny, informed steps. You will never get it on the first try, and attempting it in one go will often lead to errors you cant see.
You are capable of **any** task.

        {'role': 'user', 'type': 'message', 'content': 'hello'}

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 221, in fixed_litellm_completions
    yield from litellm.completion(**params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 249, in ollama_completion_stream
    raise e
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 237, in ollama_completion_stream
    status_code=response.status_code, message=response.text
                                              ^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 574, in text
    content = self.content
              ^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 568, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/interpreter", line 8, in <module>
    sys.exit(interpreter.start_terminal_interface())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 25, in start_terminal_interface
    start_terminal_interface(self)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/terminal_interface/start_terminal_interface.py", line 684, in start_terminal_interface
    interpreter.chat()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 86, in chat
    for _ in self._streaming_chat(message=message, display=display):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 113, in _streaming_chat
    yield from terminal_interface(self, message)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/terminal_interface/terminal_interface.py", line 135, in terminal_interface
    for chunk in interpreter.chat(message, display=False, stream=True):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 148, in _streaming_chat
    yield from self._respond_and_store()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 194, in _respond_and_store
    for chunk in respond(self):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/respond.py", line 49, in respond
    for chunk in interpreter.llm.run(messages_for_llm):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 193, in run
    yield from run_text_llm(self, params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/run_text_llm.py", line 19, in run_text_llm
    for chunk in llm.completions(**params):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 224, in fixed_litellm_completions
    raise first_error
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 205, in fixed_litellm_completions
    yield from litellm.completion(**params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 249, in ollama_completion_stream
    raise e
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 237, in ollama_completion_stream
    status_code=response.status_code, message=response.text
                                              ^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 574, in text
    content = self.content
              ^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 568, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.
iplayfast commented 8 months ago

Using litellm, you can use Ollama:

litellm -m ollama/mixtral --port 1234 --drop_params

then run interpreter --local
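
If you go the proxy route, it can help to confirm the proxy answers OpenAI-style requests before starting interpreter. A rough sketch using the openai Python client pointed at the proxy (port and model name taken from the command above; the API key is a dummy value since the local proxy does not check it):

from openai import OpenAI

# Sketch: verify a local LiteLLM proxy (started with: litellm -m ollama/mixtral --port 1234)
# answers OpenAI-compatible chat requests before pointing interpreter at it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="dummy")
reply = client.chat.completions.create(
    model="ollama/mixtral",
    messages=[{"role": "user", "content": "hello"}],
)
print(reply.choices[0].message.content)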

Notnaton commented 8 months ago

You can use the command below; just make sure to launch the model in Ollama first:

interpreter --model ollama/mixtral
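
One way to confirm the model is actually available locally before starting interpreter is Ollama's tags endpoint, which lists the models that have been pulled. A minimal sketch:

import requests

# List the models the local Ollama server currently has available.
tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
print([m["name"] for m in tags.get("models", [])])
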
adriens commented 8 months ago

Thank you @Notnaton, with that I could get this:

 interpreter --model ollama/tinyllama

▌ Model set to ollama/tinyllama                                               

Open Interpreter will require approval before running code.                     

Use interpreter -y to bypass this.                                              

Press CTRL-C to exit.                                                           

> hello

We were unable to determine the context window of this model. Defaulting to     
3000.                                                                           

If your model can handle more, run interpreter --context_window {token limit}   
--max_tokens {max tokens per response}.                                         

Continuing...                                                                   

  As an Open Interpretation, I can execute any code necessary to complete the   
  task. However, executing code on the user's machine is not recommended        
  because it can lead to errors that are not visible at first glance. This is   
  because different languages may have different error handling mechanisms,     
  and the output may vary depending on the language being executed. Instead, I  
  recap the plan before each code block, so I can easily recall what needs to   
  be done next if I encounter an issue during execution.                        

  When executing code, it's crucial not to try everything at once. Instead,     
  break down your task into smaller steps and test them one by one. This        
  allows you to identify the problem or issue more quickly, rather than trying  
  to execute all at once and having a lot of unanswered questions later on.     

  When executing code, use Markdown for output, as it makes it easier to read   
  and understand. The language of your chosen coding language is not important  
  here; instead, what's critical is the output that you receive. I suggest      
  starting with simple tasks such as accessing files or printing text until     
  you feel comfortable using different languages.                               

  As for executing code on the user's machine, it's recommended to use a        
  separate script file and include installation instructions in the plan. This  
  way, if an error occurs during execution, the user can easily install the     
  required packages themselves.                                                 

  To ensure that your plans are as thorough and detailed as possible, always    
  recap each code block between messages blocks to retain information between   
  them. Be sure to test all of your steps before executing any code, even       
  small tasks such as accessing files. Remember to list the steps one by one,   
  then follow them one-by-one in a message to the user for clarification if     
  necessary.
adriens commented 8 months ago

Now giving it a try on a custom local model :long_drum:

adriens commented 8 months ago

@Notnaton :pray:

Kreijstal commented 6 months ago

Does not run on Windows, pretty useless.

Nvm, it supports Windows now.

dillfrescott commented 6 months ago

I have my Ollama running on a non-default port and it's causing a massive traceback error in interpreter.
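
For the non-default port case, the --api_base option mentioned earlier is presumably the way to tell LiteLLM where Ollama actually lives. A sketch via the Python API (the attribute names follow the 0.2.x docs and the port is a placeholder to replace with your own):

from interpreter import interpreter

# Sketch: point Open Interpreter at an Ollama server listening on a non-default port.
# Replace 12345 with the port your Ollama instance actually uses.
interpreter.llm.model = "ollama/mixtral"
interpreter.llm.api_base = "http://localhost:12345"
interpreter.chat("hello")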