A/B test user prompts - Githubissues

calebsheridan commented 7 months ago

Add ability for A/B testing user prompts.

Migrate from single prompt to list of prompts
Add UI controls to add/remove prompts
Modify prompt dialogue
Add prompt details to grid results
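
For readers skimming the thread, the first two items could look roughly like the sketch below. It is only an illustration: `LegacyFormValues`, `FormValues`, `migrateFormValues`, `addPrompt` and `removePrompt` are hypothetical names, not the ones used in the PR, and the "keep at least one prompt" rule is an assumption.

```typescript
// Hypothetical shapes: the real form state in the app may differ.
interface LegacyFormValues {
  prompt: string;
  system_prompt: string;
}

interface FormValues {
  prompts: string[];
  system_prompt: string;
}

// Convert an old single-prompt state into the list-based one.
function migrateFormValues(old: LegacyFormValues): FormValues {
  return { prompts: [old.prompt], system_prompt: old.system_prompt };
}

// Handler behind an "add prompt" control: appends a new (empty by default) prompt.
function addPrompt(values: FormValues, prompt: string = ""): FormValues {
  return { ...values, prompts: [...values.prompts, prompt] };
}

// Handler behind a "remove prompt" control.
function removePrompt(values: FormValues, index: number): FormValues {
  // Assumption: keep at least one prompt so an experiment always has input.
  if (values.prompts.length <= 1) return values;
  return { ...values, prompts: values.prompts.filter((_, i) => i !== index) };
}
```

Each helper returns a new object instead of mutating its input, which tends to fit React-style state updates.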

Notes:

Merge caused system prompt to no longer work with prompt dialogue (temporarily disabled).

dezoito commented 7 months ago

Hi @calebsheridan.

Can you provide a description of what the PR does?

calebsheridan commented 7 months ago

@dezoito added description

dezoito commented 7 months ago

@calebsheridan, first of all, thank you again for the PRs and for the effort and detail you've put into the updates.

I really like how you solved the "multi-prompt problem" while keeping the interface intuitive and clean. If you are OK with it, I would like to discuss a few issues before merging:

1) System Prompt

> Merge caused system prompt to no longer work with prompt dialogue (temporarily disabled).

Do you see any way that could be added back?

Some people write complex system prompts, and I wish we could retain the ability to open a large "editor" so that they don't have to switch to a different program to make changes comfortably.

2) Displaying prompts for each iteration

Another issue is how to display the prompts used for each result.

Consider the current screenshot below:

I feel it would be useful to retain formatting (like line breaks) when displaying the prompts. The current order and colors also make the prompts difficult to tell apart from the rest of the parameters.

When inspecting past experiments, this is a little clearer (although I admit I am not preserving the line breaks yet):

I'd like your opinion on two possible approaches:

2.1- Move the prompt to the bottom of the inference parameters, and maybe add some spacing/different color to differentiate it from the other parameters.

OR

2.2- Put the prompt in an "accordion" at the bottom of the inference parameters, and use just the first "N" characters as the accordion trigger.

I feel like option 2.2 would work better for large prompts. In both options, and in the experiment results, line breaks should be preserved.
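
A minimal sketch of how the trigger text for option 2.2 could be derived. `promptTriggerLabel` and the default of 60 characters are hypothetical, chosen only for illustration:

```typescript
// Hypothetical helper: builds the accordion trigger label from the first
// `maxChars` characters of a prompt.
function promptTriggerLabel(prompt: string, maxChars: number = 60): string {
  // Collapse whitespace (including line breaks) so the trigger fits on one line.
  const singleLine = prompt.replace(/\s+/g, " ").trim();
  if (singleLine.length <= maxChars) return singleLine;
  // Truncate and mark the cut so it is clear that more text is available.
  return singleLine.slice(0, maxChars).trimEnd() + "...";
}
```

Only the trigger is flattened here. The full prompt inside the accordion body would stay untouched, and its line breaks could be kept when rendering with CSS `white-space: pre-wrap` (the `whitespace-pre-wrap` utility in Tailwind).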

3) Display prompts when inspecting past experiments.

Currently, since all inferences use the same prompt, the ExperimentDataDialog component just uses the one stored with the first inference:


                    <div className="p-1 font-mono text-gray-700 dark:text-gray-400">
                      {data.inferences[0].parameters.system_prompt}
                    </div>
                    <div className="p-1 font-mono text-gray-700 dark:text-gray-400">
                      {data.inferences[0].parameters.prompt}
                    </div>

I feel like we could keep this logic for the System Prompt, but each iteration should display its corresponding prompt somehow (possibly using the same component mentioned in the previous point).
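
As a rough sketch of that split, the system prompt can still be read from the first inference while the user prompt is read per inference. The types below are simplified assumptions about the stored data, and `systemPromptOf` / `promptsByInference` are hypothetical helpers:

```typescript
// Hypothetical, simplified shapes; the stored experiment data may have more fields.
interface InferenceParameters {
  prompt: string;
  system_prompt: string;
}

interface Inference {
  parameters: InferenceParameters;
}

interface ExperimentData {
  inferences: Inference[];
}

// The system prompt is shared, so reading it from the first inference still works.
function systemPromptOf(data: ExperimentData): string {
  return data.inferences[0]?.parameters.system_prompt ?? "";
}

// Each inference carries its own user prompt, so read it per item instead of
// always using inferences[0].
function promptsByInference(data: ExperimentData): string[] {
  return data.inferences.map((inference) => inference.parameters.prompt);
}
```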

I'm willing to work on points 2 and 3, but it might take some time until I can touch this.

Please let me know how you feel about these observations.

calebsheridan commented 7 months ago

1) OK
2) OK, we can try both
3) OK

At some point, it would be nice to test multiple system prompts also.

For prompts in general, I felt that a nice extension to this PR would be a local library of prompts where each prompt can be selected/deselected instead of simply added or removed (in other words, similar to how model selection works now). See https://github.com/dezoito/ollama-grid-search/issues/20
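
To make the select/deselect idea concrete, here is a small sketch. Nothing in it exists in the codebase: `LibraryPrompt`, `togglePrompt` and `selectedPrompts` are hypothetical names used only to illustrate the behaviour.

```typescript
// Hypothetical model of a local prompt library.
interface LibraryPrompt {
  id: string;
  text: string;
  selected: boolean;
}

// Flip the selection of one prompt without removing it from the library.
function togglePrompt(library: LibraryPrompt[], id: string): LibraryPrompt[] {
  return library.map((p) => (p.id === id ? { ...p, selected: !p.selected } : p));
}

// Only the selected prompts would take part in an experiment.
function selectedPrompts(library: LibraryPrompt[]): string[] {
  return library.filter((p) => p.selected).map((p) => p.text);
}
```

Deselected prompts stay in the library, so they can be re-enabled later without retyping them.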

dezoito commented 7 months ago

> 1) OK
>
> 2) OK, we can try both
>
> 3) OK

Thank you!

> At some point, it would be nice to test multiple system prompts also.
>
> For prompts in general, I felt that a nice extension to this PR would be a local library of prompts where each prompt can be selected/deselected instead of simply added or removed (in other words, similar to how model selection works now). See #20

I agree on both points... going to continue this discussion in #20.

dezoito commented 6 months ago

Merged to main. Thank you, @calebsheridan!

I'll update the README to highlight the new features and try to work on the remaining updates, then generate a new release.

dezoito / ollama-grid-search

A/B test user prompts #18