OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0
52.22k stars · 4.61k forks

Ollama is a better LLM server for local #856

Closed · iplayfast closed this issue 8 months ago

iplayfast commented 8 months ago

Is your feature request related to a problem? Please describe.

I'm already using Ollama for many things; running LM Studio for this seems wrong, since it only runs as an AppImage.

Describe the solution you'd like

Directly support Ollama at 127.0.0.1:11434 (e.g. from langchain import ollama).

Describe alternatives you've considered

LM Studio is OK, but Ollama is better.

Additional context

No response
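
For reference, Ollama exposes a plain HTTP API on that port, so "directly support Ollama" essentially means pointing the client at it. A minimal sketch of a direct call (assumes Ollama is running on the default port and that codellama:7b has already been pulled; the prompt is just an example):

# Minimal sketch: talk to the local Ollama HTTP API directly.
# Assumes Ollama is listening on its default port and codellama:7b is pulled.
import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={"model": "codellama:7b", "prompt": "Say hello in one word.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])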

Notnaton commented 8 months ago

You can use Ollama with:

interpreter --api_base http://localhost:port/v1
mak448a commented 8 months ago

You can use Ollama with:

interpreter --api_base http://localhost:port/v1

I tried interpreter --api_base http://localhost:11434/v1 --vision --local but it didn't work. Any ideas?

Exception: Error occurred. OpenAIException - 404 page not found
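
A quick way to rule out a connection problem before blaming the flags: a running Ollama server answers a plain GET on its root URL with a short status string. A minimal sanity check (adjust the port if yours differs):

import requests

# Sanity check: a running Ollama server replies to GET / with "Ollama is running".
r = requests.get("http://localhost:11434/", timeout=5)
print(r.status_code, r.text)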

hidek84 commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b
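
The same setup should also work from the Python API rather than the CLI. A rough sketch against the 0.2.x interface (the llm attribute names here follow that version's docs, so treat them as assumptions):

from interpreter import interpreter

# Sketch: configure Open Interpreter (0.2.x-style Python API) for a local Ollama model.
# The ollama/ prefix tells LiteLLM to route the request to the local Ollama server.
interpreter.llm.model = "ollama/codellama:7b"
interpreter.chat("Print 'hello' from Python")
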
richstav commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b

Perfect timing! Same issue as OP, this solved it.

mak448a commented 8 months ago

You may no longer need the --api_base option when you connect to the Ollama local model endpoint. LiteLLM (open-interpreter's dependency) will handle it behind the scenes.

The following command with open-interpreter==0.1.18 and litellm==1.16.7 works for me. (You need the ollama/ prefix in the --model argument)

interpreter --model ollama/codellama:7b

Thanks! I remembered something like this but I forgot what it was exactly.

Kreijstal commented 8 months ago

Does not run on Windows, pretty useless.

adriens commented 8 months ago

I could not make it run on my Linux box:

 interpreter --model ollama/codellama:7b

▌ Model set to ollama/codellama:7b                                            

Open Interpreter will require approval before running code.                     

Use interpreter -y to bypass this.                                              

Press CTRL-C to exit.                                                           

> hello

We were unable to determine the context window of this model. Defaulting to     
3000.                                                                           

If your model can handle more, run interpreter --context_window {token limit}   
--max_tokens {max tokens per response}.                                         

Continuing...                                                                   

        Python Version: 3.11.7
        Pip Version: 23.3.1
        Open-interpreter Version: cmd:Interpreter, pkg: 0.2.0
        OS Version and Architecture: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
        CPU Info: x86_64
        RAM Info: 7.57 GB, used: 2.12, free: 0.78

        # Interpreter Info

        Vision: False
        Model: ollama/codellama:7b
        Function calling: None
        Context window: None
        Max tokens: None

        Auto run: False
        API base: None
        Offline: False

        Curl output: Not local

        # Messages

        System Message: You are Open Interpreter, a world-class programmer that can complete any goal by executing code.
First, write a plan. **Always recap the plan between each code block** (you have extreme short-term memory loss, so you need to recap the plan between each message block to retain it).
When you execute code, it will be executed **on the user's machine**. The user has given you **full and complete permission** to execute any code necessary to complete the task. Execute the code.
If you want to send data between programming languages, save the data to a txt or json.
You can access the internet. Run **any code** to achieve the goal, and if at first you don't succeed, try again and again.
You can install new packages.
When a user refers to a filename, they're likely referring to an existing file in the directory you're currently executing code in.
Write messages to the user in Markdown.
In general, try to **make plans** with as few steps as possible. As for actually executing code to carry out that plan, for *stateful* languages (like python, javascript, shell, but NOT for html which starts from 0 every time) **it's critical not to try to do everything in one code block.** You should try something, print information about it, then continue from there in tiny, informed steps. You will never get it on the first try, and attempting it in one go will often lead to errors you cant see.
You are capable of **any** task.

        {'role': 'user', 'type': 'message', 'content': 'hello'}

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 221, in fixed_litellm_completions
    yield from litellm.completion(**params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 249, in ollama_completion_stream
    raise e
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 237, in ollama_completion_stream
    status_code=response.status_code, message=response.text
                                              ^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 574, in text
    content = self.content
              ^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 568, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/bin/interpreter", line 8, in <module>
    sys.exit(interpreter.start_terminal_interface())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 25, in start_terminal_interface
    start_terminal_interface(self)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/terminal_interface/start_terminal_interface.py", line 684, in start_terminal_interface
    interpreter.chat()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 86, in chat
    for _ in self._streaming_chat(message=message, display=display):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 113, in _streaming_chat
    yield from terminal_interface(self, message)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/terminal_interface/terminal_interface.py", line 135, in terminal_interface
    for chunk in interpreter.chat(message, display=False, stream=True):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 148, in _streaming_chat
    yield from self._respond_and_store()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/core.py", line 194, in _respond_and_store
    for chunk in respond(self):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/respond.py", line 49, in respond
    for chunk in interpreter.llm.run(messages_for_llm):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 193, in run
    yield from run_text_llm(self, params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/run_text_llm.py", line 19, in run_text_llm
    for chunk in llm.completions(**params):
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 224, in fixed_litellm_completions
    raise first_error
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/interpreter/core/llm/llm.py", line 205, in fixed_litellm_completions
    yield from litellm.completion(**params)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 249, in ollama_completion_stream
    raise e
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/litellm/llms/ollama.py", line 237, in ollama_completion_stream
    status_code=response.status_code, message=response.text
                                              ^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 574, in text
    content = self.content
              ^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/site-packages/httpx/_models.py", line 568, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.
iplayfast commented 8 months ago

Using litellm, you can use Ollama:

litellm -m ollama/mixtral --port 1234 --drop_params

then run interpreter --local
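
If you go the proxy route, it can help to confirm the proxy answers OpenAI-style requests before starting interpreter. A rough sketch using the openai Python client pointed at the proxy (port and model name taken from the command above; the API key is a dummy value since the local proxy does not check it):

from openai import OpenAI

# Sketch: verify a local LiteLLM proxy (started with: litellm -m ollama/mixtral --port 1234)
# answers OpenAI-compatible chat requests before pointing interpreter at it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="dummy")
reply = client.chat.completions.create(
    model="ollama/mixtral",
    messages=[{"role": "user", "content": "hello"}],
)
print(reply.choices[0].message.content)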

Notnaton commented 8 months ago

You can use the command below; just make sure to launch the model in Ollama first:

interpreter --model ollama/mixtral
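
One way to confirm the model is actually available locally before starting interpreter is Ollama's tags endpoint, which lists the models that have been pulled. A minimal sketch:

import requests

# List the models the local Ollama server currently has available.
tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
print([m["name"] for m in tags.get("models", [])])
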
adriens commented 8 months ago

Thank you @Notnaton, with that I could get this:

 interpreter --model ollama/tinyllama

▌ Model set to ollama/tinyllama                                               

Open Interpreter will require approval before running code.                     

Use interpreter -y to bypass this.                                              

Press CTRL-C to exit.                                                           

> hello

We were unable to determine the context window of this model. Defaulting to     
3000.                                                                           

If your model can handle more, run interpreter --context_window {token limit}   
--max_tokens {max tokens per response}.                                         

Continuing...                                                                   

  As an Open Interpretation, I can execute any code necessary to complete the   
  task. However, executing code on the user's machine is not recommended        
  because it can lead to errors that are not visible at first glance. This is   
  because different languages may have different error handling mechanisms,     
  and the output may vary depending on the language being executed. Instead, I  
  recap the plan before each code block, so I can easily recall what needs to   
  be done next if I encounter an issue during execution.                        

  When executing code, it's crucial not to try everything at once. Instead,     
  break down your task into smaller steps and test them one by one. This        
  allows you to identify the problem or issue more quickly, rather than trying  
  to execute all at once and having a lot of unanswered questions later on.     

  When executing code, use Markdown for output, as it makes it easier to read   
  and understand. The language of your chosen coding language is not important  
  here; instead, what's critical is the output that you receive. I suggest      
  starting with simple tasks such as accessing files or printing text until     
  you feel comfortable using different languages.                               

  As for executing code on the user's machine, it's recommended to use a        
  separate script file and include installation instructions in the plan. This  
  way, if an error occurs during execution, the user can easily install the     
  required packages themselves.                                                 

  To ensure that your plans are as thorough and detailed as possible, always    
  recap each code block between messages blocks to retain information between   
  them. Be sure to test all of your steps before executing any code, even       
  small tasks such as accessing files. Remember to list the steps one by one,   
  then follow them one-by-one in a message to the user for clarification if     
  necessary.
adriens commented 8 months ago

Now giving it a try on a custom local model :long_drum:

adriens commented 8 months ago

@Notnaton :pray:

Kreijstal commented 6 months ago

Does not run on Windows, pretty useless.

Nvm, it supports Windows now.

dillfrescott commented 6 months ago

I have my Ollama running on a non-default port and it's causing a massive traceback error in interpreter.
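
For the non-default port case, the --api_base option mentioned earlier is presumably the way to tell LiteLLM where Ollama actually lives. A sketch via the Python API (the attribute names follow the 0.2.x docs and the port is a placeholder to replace with your own):

from interpreter import interpreter

# Sketch: point Open Interpreter at an Ollama server listening on a non-default port.
# Replace 12345 with the port your Ollama instance actually uses.
interpreter.llm.model = "ollama/mixtral"
interpreter.llm.api_base = "http://localhost:12345"
interpreter.chat("hello")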