Open kurtbuilds opened 4 weeks ago
We might need to modify the prompts to remove `{{REWRITTEN_CODE}}` and the triple backticks, same for `{{INSERTED_CODE}}`.
Also, I'm not finding code in the zed repo that strips `REWRITTEN_CODE` or `INSERTED_CODE`. Those strings don't appear anywhere except the template. Is it possible the stripping is missing from `handle_stream` or somewhere nearby?
I made some changes in `~/.config/zed/prompt_overrides/content_prompt.hbs`, and even the small models (e.g. `llama3.2:3b`) stopped inserting bogus start/stop chars when I nuked `{{REWRITTEN_CODE}}` and used this last instruction instead:
Immediately start your response with no remarks before nor after, only the rewritten code:
Also, this has worked well too:
Immediately start your response in a single markdown code block (triple backticks) with no remarks before and none after.
Likewise with {{REWRITTEN_CODE}}
I have a hunch that most models trip up b/c this is confusing (end of the prompt):
Immediately start with the following format with no remarks:
```
{{INSERTED_CODE}}
```
Here is for rewrite:
Immediately start with the following format with no remarks:
```
{{REWRITTEN_CODE}}
```
Is the model supposed to add the triple backticks? Or just `{{INSERTED_CODE}}`? Too much confusion IMO!
I've also seen models add `{{INSERTED_CODE}}` at the end of the response too. Even worse, the smaller models often start with `{{INSERTED_CODE}` (only one trailing `}`) and add an extra `}` at the very end.
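To make the failure modes above concrete, here is a rough sketch of the kind of post-processing that could strip them. This is not Zed's code (as noted, I couldn't find any such stripping in the repo); `strip_artifacts` and its helpers are hypothetical names, and the rules are just my guesses based on the outputs I've seen.

```python
import re

# Placeholder names from the prompt template discussed in this thread.
PLACEHOLDERS = ("INSERTED_CODE", "REWRITTEN_CODE")

# The placeholder with zero to two braces on either side, e.g.
# "{{INSERTED_CODE}}", "{{INSERTED_CODE}" or a bare "INSERTED_CODE".
_TOKEN = r"\{{0,2}(?:" + "|".join(PLACEHOLDERS) + r")\}{0,2}"
_LEADING = re.compile(r"^\s*" + _TOKEN + r"[ \t]*\n?")
_TRAILING = re.compile(r"\n?[ \t]*" + _TOKEN + r"\s*$")
_STRAY_CLOSE = re.compile(r"\n[ \t]*\}\}\s*$")
_FENCE = re.compile(r"^```[^\n]*\n(.*?)\n?```\s*$", re.DOTALL)


def _unfence(text: str) -> str:
    """Return the body if the whole text is one markdown code fence."""
    match = _FENCE.match(text.strip())
    return match.group(1) if match else text


def strip_artifacts(response: str) -> str:
    """Remove echoed placeholders and a wrapping markdown fence (assumed rules)."""
    text = _unfence(response)
    text, had_leading = _LEADING.subn("", text, count=1)
    text = _TRAILING.sub("", text, count=1)
    if had_leading:
        # Only drop a closing "}}" line when a placeholder was echoed at the
        # start, so code that legitimately ends in braces is left alone.
        text = _STRAY_CLOSE.sub("", text, count=1)
    # The fence may also sit inside the echoed placeholder, so check again.
    return _unfence(text)
```

For example, `strip_artifacts("{{REWRITTEN_CODE}\nfn main() {}\n}}")` gives back just `fn main() {}`. The obvious caveat is that this would also eat a real identifier named `INSERTED_CODE` at the very start or end of the code, which is one more argument for fixing the prompt rather than patching the output.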
How about change it to just:
Immediately start with the following format with no remarks:
INSERTED_CODE
OR, come up with something more unique, but drop all the problematic characters and don't leave a hint that markdown might be involved.
Also maybe change to `CODE_TO_INSERT`, because `INSERTED_CODE` is past tense for a future action.
OR, for the insert case, why do you need it to start with anything?
I also often see models add explanations AFTER the inserted/replaced code. The prompt should make clear not to do that anywhere (not before, not after, not in the middle).
Another thing, I often find blank lines are removed before/after a selection. There should be some mechanism to preserve those. Either edge case code, OR, tell the model not to remove leading/trailing blank lines.
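For the "edge case code" option, something along these lines might be enough. Again this is only a sketch under my own assumptions, not anything from the Zed codebase: `preserve_blank_edges` is a made-up name, and it assumes the editor has both the original selection and the model's replacement as plain strings.

```python
import re

# Run of blank (whitespace-only) lines at the very start of a string.
_LEAD = re.compile(r"\A(?:[ \t]*\n)*")
# Run of newlines (with optional trailing spaces) at the very end.
_TRAIL = re.compile(r"(?:\n[ \t]*)*\Z")


def preserve_blank_edges(original: str, rewritten: str) -> str:
    """Give the rewritten text the same blank edge lines as the original."""
    if not original.strip():
        # Nothing but whitespace was selected, so there are no edges to copy.
        return rewritten
    lead = _LEAD.match(original).group(0)
    trail = _TRAIL.search(original).group(0)
    # Drop whatever blank edges the model produced, then restore the originals.
    core = _LEAD.sub("", rewritten, count=1)
    core = _TRAIL.sub("", core, count=1)
    return lead + core + trail
```

With an original selection of `"\n\nfoo()\n\n"` and a model response of `"bar()"`, this returns `"\n\nbar()\n\n"`. Indentation on the first code line is left untouched, since only fully blank lines count as edges.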
FYI here is what it looks like when REWRITTEN_CODE is inserted (using llama3.2:3b):
Notice that it includes {{REWRITTEN_CODE} with only one trailing curly brace and then at the bottom it includes double }} that are also not desired.
And, it just dawned on me that the intent of the template is not for the model to start with {{REWRITTEN_CODE}} but rather that is a placeholder... wow was that not at all obvious to me, no wonder the models are confused too!
More examples
Prepends {{REWRITTEN_CODE}}
Surrounds with `{{` and `}}`
Here are some insert examples (using `llama3.2:1b`, which is highly prone to being confused):
`INSERTED_CODE` before & after:
`INSERTED_CODE` before only:
I opened a PR with changes to the default prompt. I have not spent a ton of time extensively testing it, so I am not married to it. Just wanted to start the conversation there. It does seem to work well with my testing using smaller llama models.
Describe the bug / provide steps to reproduce it
When doing code completions (with the inline assistant), I frequently get results with the `{{INSERTED_CODE}}` token left in as an artifact. See example inline assistant output below.
I asked inline assistant to write a short shell script. The result is this:
Neither the `{{INSERTED_CODE}}` token nor the markdown backtick block should be in the results.
This is using GitHub Copilot.
Environment
Zed: v0.157.5 (Zed) OS: macOS 14.5.0 Memory: 64 GiB Architecture: aarch64
If applicable, attach your Zed.log file to this issue.
Zed.log