OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0

System Message overrides prime directive... #1094

Open bars0um opened 3 months ago

bars0um commented 3 months ago

Describe the bug

It's important to note how influential the default system message is. I've been going in circles trying to understand why the interpreter has been giving me errors like the one below.

I am currently trying to make interpreter leverage mixtral to write me a basic ruby on rails app.

It starts off quite accurately and happily tells me about the migrations it will create. But after slowly but surely writing out the code, it would kick off this cryptic error:

'ruby' disabled or not supported.

and then

Apologies for that. The given code is actually written in Ruby, and it's for creating migration files for Rails. Since I cannot execute the code, here's a summary of what this code does and what comes next:

It took me a while to realize that it's actually trying to execute the ruby code it created but using the Python interpreter...

What clued me in was snooping through the default system message and finding this emphatic instruction:

"When you execute code, it will be executed on the user's machine. The user has given you full and complete permission to execute any code necessary to complete the task. Execute the code."

I changed this to:

"You may write and execute python or bash code that creates the file the user requested in the language that they requested. please do not attempt to execute ruby code. you may only execute python or bash. please do not run migrations. only write the code that is requested by creating bash script that echos the content into the target file. "

This finally allowed me to get past the blocking error that was happening before and made the model actually output the source code I was looking for (well, of course it tried to migrate anyway, because I asked it to "implement the migration", but you get what I mean). I couldn't believe how strictly the model followed the instructions; anyhow, glad that's sorted.

I've opened this issue because others are likely to hit this; there ought to be some sort of warning in the docs highlighting how important the system message instructions are, and that the default one needs to be considered in the context of what you are doing.
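For anyone landing here, the override above can be done from Python. The sketch below uses a hypothetical `Interpreter` stand-in class rather than importing the real package, but the idea matches the project's documented API, where `interpreter.system_message` is a mutable string:

```python
# Hypothetical stand-in for open-interpreter's `interpreter` object.
# The real package exposes a mutable `interpreter.system_message` string;
# the class here only sketches the override described in this issue.
class Interpreter:
    def __init__(self):
        # Abbreviated version of the default directive quoted above:
        self.system_message = (
            "When you execute code, it will be executed on the user's "
            "machine. ... Execute the code."
        )

interpreter = Interpreter()

# Replace (rather than append to) the default, so the blanket
# "Execute the code." directive cannot conflict with task-specific rules:
interpreter.system_message = (
    "You may write and execute python or bash code that creates the file "
    "the user requested. Do not attempt to execute ruby code."
)
```

Replacing the whole message, instead of appending to it, is what matters here: an appended instruction can still lose out to the stronger default directive.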

Reproduce

Ask the interpreter to write you any ruby code.

Expected behavior

A file to be created in the target folder containing the ruby code.

Screenshots

(screenshot)

Open Interpreter version

0.2.2

Python version

3.11

Operating System name and version

Debian-base docker container

Additional context

No response

bars0um commented 3 months ago

Result after overriding the system message and being very emphatic in the prompt... XD

(screenshot)

MikeBirdTech commented 3 months ago

Hey @bars0um

I'm happy you got it working! Love that you dove in, changed something, and made the tool better for you

Thank you for sharing your experience. This is great feedback. Are you in the Discord? https://discord.gg/Hvz9Axh84z

bars0um commented 3 months ago

Thanks for inviting me @MikeBirdTech and thanks for the great effort on this project!

MikeBirdTech commented 3 months ago

Glad you joined and are a part of this community @bars0um!

In regards to this issue, what do you have in mind for documentation updates? Did you want to take a shot at making the changes?

Thanks!

bars0um commented 3 months ago

@MikeBirdTech Sure, I would love to contribute. I think it warrants a discussion though; is the aim of the open-interpreter project to be a general-purpose enabler for AI models? My thinking is that if I were to fit out open-interpreter to better serve a specific use (in this case becoming a sort of junior developer), it may perform better at that task but fall short on others.

Also, I'm noticing more and more that the model you use is crucial. If it maps and retains the instructions well, it performs better. Models that handle fewer tokens or less complexity seem to fall off pretty early. (I've also noticed that hardware can affect model performance, and even where you run it: for example, ollama on Windows seems to leverage my resources better than on WSL2.)

Perhaps open-interpreter can load different "personas", kind of like OpenAI assistants. The personas would each have their own set of instructions with the appropriate wording. For example, in my attempt at making an AI Ruby on Rails developer, I find myself scouring the open-interpreter source code for hardcoded instructions and modifying them to ensure that nothing conflicts and that the end result is consistent.

Perhaps simply moving the hardcoded strings into loadable files such as personaX.json, and using constants to load the correct string, would help folks create different kinds of "assistants" that perform better for their purpose.
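The loadable-persona idea could be as small as this. Everything here is hypothetical (the file layout, the `system_message` key, the `load_persona` helper are all my own naming, not anything in the open-interpreter codebase):

```python
import json
import tempfile
from pathlib import Path

# Hypothetical persona file: the instructions live in data, not in code.
persona = {
    "name": "rails_developer",
    "system_message": (
        "Only execute python or bash. Write any requested Ruby code "
        "to files instead of executing it. Do not run migrations."
    ),
}

def load_persona(path):
    """Read a persona JSON file and return its system message."""
    data = json.loads(Path(path).read_text())
    return data["system_message"]

# Round-trip through a temp file to show the flow end to end:
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False
) as f:
    json.dump(persona, f)
    persona_path = f.name

message = load_persona(persona_path)
```

The point is just that swapping personas becomes a matter of pointing at a different file, with no source-code edits.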

MikeBirdTech commented 3 months ago

@bars0um Check out interpreter/terminal_interface/profiles to learn about profiles, which seems to be what you're looking for.

We definitely don't want to overfit OI for a specific use-case, but we aim to build it in a way where custom profiles can greatly increase its capabilities in different use-cases.

bars0um commented 3 months ago

@MikeBirdTech I guess profiles could be the way to do this, yes. I noticed that you can replace the system message that way, for example.

I was thinking, though, that it would be better to distinguish between a profile and a persona. A profile would be more settings-oriented: choosing the model, setting the max tokens, etc. A persona would define the character the model needs to assume to carry out the task, along with the appropriate instructions for the function at hand.

For example, at present, with profiles, there is no way to override the instructions in this section: https://github.com/KillianLucas/open-interpreter/blob/68ab324b72064aac261fc662c2ceaaaf150c75a8/interpreter/core/core.py#L53

or here:

https://github.com/KillianLucas/open-interpreter/blob/68ab324b72064aac261fc662c2ceaaaf150c75a8/interpreter/core/llm/run_text_llm.py#L8

The instructions in those lines, for example, are somewhat context-specific, which ends up throwing off a task like mine and makes the model diverge a bit. I think this is also caused by the feedback loop where the output from executing code is fed back to the model. This might need to be considered carefully in light of the initial goal. For example, in my case, I might simply want it to stop, or to review the code it wrote against the initial request and determine whether improvements are warranted, instead of executing the code and looking at the output.
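To make the profile/persona split concrete, here is one possible shape for the two objects. Every field name here is my own invention, purely to illustrate the separation of settings from character:

```python
from dataclasses import dataclass

# Hypothetical split proposed above: Profile holds settings,
# Persona holds the character and its instructions.
@dataclass
class Profile:
    model: str
    max_tokens: int
    auto_run: bool = False

@dataclass
class Persona:
    name: str
    system_message: str
    # e.g. "review" the generated code instead of executing it,
    # addressing the feedback-loop concern above:
    execution_policy: str = "review"

profile = Profile(model="mixtral", max_tokens=4096)
persona = Persona(
    name="rails_developer",
    system_message=(
        "Write the requested Ruby on Rails code to files; "
        "do not run migrations."
    ),
)
```

Under this split, the same persona could be reused across profiles (different models, token limits) without touching its instructions.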

MikeBirdTech commented 3 months ago

@bars0um I've been exploring the idea of making every aspect of the prompt customizable. Finding these "uneditable" ones is good to know. There's a lot of work to be done with prompting, so I think optimizing that would take precedence over introducing the new concept of personas, but I'm absolutely open to hearing more of your thoughts on it.

bars0um commented 3 months ago

I love this persuasive weight-setting technique 🤣

https://github.com/OpenInterpreter/open-interpreter/blob/0029ddb266b0acbba7228131a116d702ce40d4f3/interpreter/core/computer/skills/skills.py#L85

bars0um commented 3 months ago

@MikeBirdTech I've made an initial effort in adding ruby support ( see https://github.com/OpenInterpreter/open-interpreter/pull/1105/files)

It doesn't necessarily help me on the Rails front, but it seemed like a basic step that should be there anyhow.
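For context on why the original error appeared at all: the interpreter can only execute languages it has a registered runtime for. The sketch below is a hypothetical, minimal model of such a registry (not the project's actual code, which lives in its languages module) showing how an unregistered language produces exactly the message from this issue:

```python
import subprocess

# Hypothetical minimal language registry: language name -> run command.
# Languages missing from the registry cannot be executed, which is why
# 'ruby' failed here before the PR added support for it.
SUPPORTED = {
    "python": ["python3", "-c"],
    "shell": ["bash", "-c"],
}

def run(language: str, code: str) -> str:
    if language not in SUPPORTED:
        # Mirrors the error message seen in this issue.
        return f"'{language}' disabled or not supported."
    result = subprocess.run(
        SUPPORTED[language] + [code], capture_output=True, text=True
    )
    return result.stdout.strip()

# run("ruby", "puts 1") -> "'ruby' disabled or not supported."
```

Adding a language, as the PR does, amounts to registering a runtime for it so the dispatch above has somewhere to send the code.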

MikeBirdTech commented 3 months ago

Reviewed the PR. Initial look is good but I need more eyes on it, as I'm not a Ruby guy.