OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0
50.58k stars 4.41k forks

Hosted multimodal models from Open Router currently don't work on Open Interpreter #1252

Closed Merlinvt closed 1 month ago

Merlinvt commented 2 months ago

Describe the bug

I am trying to use multimodal models from Open Router (Claude, Llava, ChatGPT, Gemini...) but it seems that the current implementation does not support that. (Or at least I have not figured it out).

The settings documentation and the guide on using hosted models from OpenRouter (https://docs.openinterpreter.com/language-models/hosted-models/openrouter) say I need to specify the model like this in my profile YAML: model: "openrouter/anthropic/claude-3-haiku"

That works fine if I want to use a non-multimodal model.

If I want to use a multimodal model, I need to specify the API base for Open Router as: api_base: https://openrouter.ai/api/v1/chat/completions

If I do that, Open Interpreter automatically prepends "openai/" to the model name.

When I start Open Interpreter with interpreter --profile openrouter.yaml, I get the message: Model set to openai/openrouter/anthropic/claude-3-haiku

I think the issue is in start_terminal_interface, lines 401-410: interpreter.llm.model = "openai/" + interpreter.llm.model

It can be fixed by adding the condition: and "openrouter" not in interpreter.llm.api_base.lower()

I haven't done a lot of coding recently or contributed to open-source projects. I'm happy to open a pull request if that's not overkill for such a small thing.

If I do open a pull request, I could also update the Open Interpreter documentation on multimodal models, since the current instructions do not work for them.

Thanks for your awesome project :)

Reproduce

The following settings in the profile .yaml

api_base: https://openrouter.ai/api/v1/chat/completions
model: "openrouter/anthropic/claude-3-haiku"

results in: Model set to openai/openrouter/anthropic/claude-3-haiku

Expected behavior

Should result in:

Model set to openrouter/anthropic/claude-3-haiku

Screenshots

No response

Open Interpreter version

0.2.5

Python version

3.11.9

Operating System name and version

Ubuntu 22

Additional context

No response

Merlinvt commented 2 months ago

Here is the documentation on OpenRouter about multi-modal models. https://openrouter.ai/docs#images-_-multimodal-requests

Merlinvt commented 1 month ago

Sorry, my mistake ... litellm already implements this. I should have just omitted api_base: https://openrouter.ai/api/v1/chat/completions
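For reference, under this resolution the profile would simply omit api_base and keep only the prefixed model name, letting litellm route the request to OpenRouter itself (a minimal sketch matching the settings shown earlier in this issue):

```yaml
# openrouter.yaml -- minimal profile sketch; no api_base needed,
# litellm resolves the "openrouter/" prefix to the OpenRouter endpoint.
model: "openrouter/anthropic/claude-3-haiku"
```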