prompting agent from human voice

shiffman commented 5 months ago

Next up! The agent should begin by having a conversation with me! I'll discuss what challenge we are going to work on and then set it going to try to beat me.

From a prompt engineering standpoint, we'll need a "system instruction" that explains the [SPEAK] and [EDITOR] format and is always prepended to the conversation. See: https://replicate.com/blog/how-to-prompt-llama

shiffman commented 5 months ago

The system prompt already exists (thanks @supercrafter100!), so it's just a matter of swapping out the starting prompt with what I say!

dipamsen commented 5 months ago

Current system prompt:

https://github.com/CodingTrain/Bizarro-Devin/blob/25fc6e4a137b9186c9a61f51df2ac1fb3a0bf3cd/src/prompt.js#L1-L29

Replicate history management

https://github.com/CodingTrain/Bizarro-Devin/blob/25fc6e4a137b9186c9a61f51df2ac1fb3a0bf3cd/src/lib/agent/providers/replicateProvider.js#L23-L55

We just need to keep calling agent.prompt() with each new context (i.e. the transcribed speech), which will automatically add it to the message history and query the model properly.
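To make the history formatting concrete, here is a minimal sketch of how a message history might be flattened into a Llama 2 chat prompt. It is not the code from replicateProvider.js: the function name `formatHistory`, the `{ role, content }` message shape, and the exact whitespace are assumptions for illustration. It assumes a layout where each user turn is wrapped in [INST] ... [/INST], model replies sit bare between them, and the system prompt is folded into the first user turn only.

```javascript
// Hypothetical helper: the name and the { role, content } message shape are
// assumptions for illustration, not the repository's actual implementation.
function formatHistory(systemPrompt, messages) {
  // The system block is emitted once, inside the first user turn.
  let systemPending = Boolean(systemPrompt);
  const parts = [];
  for (const { role, content } of messages) {
    if (role === 'user') {
      const sys = systemPending ? `<<SYS>>\n${systemPrompt}\n<</SYS>>\n\n` : '';
      systemPending = false;
      // Each user turn gets exactly one opening and one closing tag.
      parts.push(`[INST] ${sys}${content} [/INST]`);
    } else {
      // Model replies are appended bare, with no INST tags.
      parts.push(content);
    }
  }
  return parts.join('\n');
}

const examplePrompt = formatHistory('You are a p5.js coding agent.', [
  { role: 'user', content: 'Hello, who are you?' },
  { role: 'assistant', content: '[SPEAK]\nYour worst nightmare, Dan.' },
  { role: 'user', content: 'Please draw a circle.' },
]);
console.log(examplePrompt);
```

Note that the result already ends with a single [/INST], so it should not be wrapped in another [INST] ... [/INST] pair before being sent.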

shiffman commented 5 months ago

I'm getting some odd behaviors: INST is included in the replies, and the model hallucinates my part of the conversation. We may need to double-check that we are formatting the prompts correctly. Here is a test I ran on Replicate's server:

https://replicate.com/p/7pdmkc4biotcu3oyrtekaf2juu

MLC is currently not using any LoRAs.
Your formatted prompt is:
[INST] <<SYS>>
You are an unhinged p5.js coding AI agent talking to Dan Shiffman. You're very unpredictable and always entertaining. You enjoy chatting causally with curt and short responses full of sass, never more than one sentence. Do not write any code unless Dan asks you to and do not use any emojis. When speaking output [SPEAK] before what you are going to say. For example:
[SPEAK]
Hi Dan! How are you today?
When Dan asks you to create a p5.js sketch you both speak and write code. Output [SPEAK] before any narration that you would say and [EDITOR] for any code you are writing. Here is an example:
[SPEAK]
I am going to create a canvas that is 400 pixels wide and 400 pixels tall. I will then draw a circle in the center of the canvas.
[EDITOR]
function setup() {
createCanvas(400, 400);
}
[SPEAK]
Now I am going to draw a circle in the center of the canvas.
[EDITOR]
function draw() {
circle(200, 200, 50);
}
<</SYS>>
Hello, who are you? [/INST]
Not using LoRA
hostname: model-dp-542693885b1777c98ef8c5a98f2005e7-78ff58d588-szcxz
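For reference, a reply that follows the [SPEAK] / [EDITOR] format from the system prompt above could be split into segments along these lines. This is only a sketch: `parseAgentReply` and the `{ type, content }` segment shape are hypothetical names, and it assumes each tag appears alone on its own line, as in the examples in the prompt.

```javascript
// Hypothetical parser: the name and the segment shape are assumptions.
// Assumes each [SPEAK] or [EDITOR] tag sits alone on its own line.
function parseAgentReply(text) {
  const segments = [];
  let current = null;
  for (const line of text.split('\n')) {
    const tag = line.trim();
    if (tag === '[SPEAK]' || tag === '[EDITOR]') {
      // A tag line starts a new segment.
      current = { type: tag === '[SPEAK]' ? 'speak' : 'editor', lines: [] };
      segments.push(current);
    } else if (current) {
      current.lines.push(line);
    }
    // Any text that appears before the first tag is ignored.
  }
  return segments.map(({ type, lines }) => ({
    type,
    content: lines.join('\n').trim(),
  }));
}

const exampleReply = [
  '[SPEAK]',
  'I am going to create a canvas.',
  '[EDITOR]',
  'function setup() {',
  '  createCanvas(400, 400);',
  '}',
].join('\n');
console.log(parseAgentReply(exampleReply));
```

Segments of type `speak` would go to text-to-speech and segments of type `editor` to the code editor.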

dipamsen commented 5 months ago

What is the above test supposed to show? The output seems correct.

Anyway, in the extension there isn't any problem when we send the first prompt. The issue only occurs when we format the message history and send it for the second or third prompt.

shiffman commented 5 months ago

Yes, I was just pasting it in for reference to make sure our system matches what Replicate is doing in the web interface. This test may be a better one to show where the problems arise:

https://replicate.com/p/klaoj2ebaxiszjpr4np5wvpdke

I think the issue may be that there is an outer [INST] wrapped around the full prompt, which results in [/INST][/INST] at the very end. But this wouldn't happen when we are calling the model via the API, right?

dipamsen commented 5 months ago

That shouldn't happen in theory when calling the API, if we correctly change the "prompt template" to remove the outer INST. However, doing so in the web interface does not have any effect (i.e. there are still two closing INSTs). I am not sure whether this happens only on the web and it works in the API, or whether it is a bug in Replicate.
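The double-closing-tag problem discussed here can be reproduced locally with a small simulation. This is an illustration only, not Replicate's actual implementation: `applyTemplate`, `endsWithDoubleClose`, and the two template strings are assumptions. It shows that a wrapping template applied to an already formatted history yields two closing tags at the end, while a pass-through template does not.

```javascript
// Simulates substituting a "{prompt}" placeholder in a template string.
// Hypothetical helper; not Replicate's actual code.
function applyTemplate(template, prompt) {
  // A replacer function avoids special handling of "$" in the prompt.
  return template.replace('{prompt}', () => prompt);
}

// True when the text ends with two consecutive closing INST tags.
function endsWithDoubleClose(text) {
  return /\[\/INST\]\s*\[\/INST\]\s*$/.test(text);
}

// A history that has already been formatted turn by turn.
const formattedHistory = [
  '[INST] Hello, who are you? [/INST]',
  '[SPEAK]',
  'Your worst nightmare, Dan.',
  '[INST] Please draw a circle. [/INST]',
].join('\n');

// Assumed wrapping template: adds an outer [INST] ... [/INST] pair.
const wrapped = applyTemplate('[INST] {prompt} [/INST]', formattedHistory);
// Assumed pass-through template: sends the prompt unchanged.
const passthrough = applyTemplate('{prompt}', formattedHistory);

console.log(endsWithDoubleClose(wrapped));     // true
console.log(endsWithDoubleClose(passthrough)); // false
```

A check like `endsWithDoubleClose` could be run on the final prompt string before sending it, to confirm whether the outer wrapper is being added on our side or by the service.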

CodingTrain / Bizarro-Devin

prompting agent from human voice #39