Agenta-AI / agenta

The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.
http://www.agenta.ai
MIT License

Fea: Autogenerate test scenarios in playground and test sets #219

Closed mmabrouk closed 5 months ago

mmabrouk commented 1 year ago

Is your feature request related to a problem? Please describe.
Users don't always have enough test data to test their LLM apps.

Describe the solution you'd like
A button in the playground to auto-generate a new test point. Clicking the button would call GPT-3.5 with a prompt that includes the original prompt from the variant and the previous test points, asking it to provide a new test point. The generated test point would be added as a new row in the playground.

Technical Details

The execution of the OpenAI call would be as follows:

Invoke gpt-3.5-turbo-0613 (refer to https://platform.openai.com/docs/guides/gpt/function-calling) with a prompt similar to the following:

  messages: [
    {
      role: "system",
      content: "The user is testing multiple data points against a prompt. Please generate a unique data point distinct from the existing ones.",
    },
    {
      role: "user",
      content: `User prompts:\n---\n`, // insert the list of user prompts from the playground here
    },
    {
      role: "assistant",
      content: null,
      function_call: {
        name: "add_data_point",
        // data inputs from box 1 in the playground
        // (note: in the actual API, `arguments` is a JSON-encoded string)
        arguments: {input1: "value1", input2: "value2"},
      },
    },
    {
      role: "assistant",
      content: null,
      function_call: {
        name: "add_data_point",
        arguments: {input1: "value1afds", input2: "value2adfs"}, // data inputs from box 2 in the playground
      },
    },
    // ... continuing for the first n boxes in the playground
  ],
  functions: [
    {
      name: "add_data_point",
      parameters: {
        type: "object",
        properties: {
          input1: { type: "string" },
          input2: { type: "string" },
        },
      },
    },
  ]

Essentially, upon clicking the auto-generate new row button in the playground, we extract the prompts from the variant in the current tab, together with the first n inputs already present, and build a prompt similar to the one described above. This prompt is then sent to OpenAI. The response, for example add_data_point(input1: somethingtheygenerate, input2: somethingtheygenerate), is parsed, and a new row is created from it.
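The flow above could be sketched as follows. This is a minimal sketch with hypothetical helper names (`buildAutoGenRequest` and `parseNewRow` are not existing agenta functions), and the network call itself is left to the caller. One detail worth noting: in the actual OpenAI API, `function_call.arguments` is a JSON-encoded string rather than an object.

```typescript
// Hypothetical helpers (not existing agenta code): build the function-calling
// request body for auto-generating a test row, and parse the model's reply.

type Row = Record<string, string>;

// Build the body for POST https://api.openai.com/v1/chat/completions.
function buildAutoGenRequest(
  prompts: string[],
  existingRows: Row[],
  inputNames: string[]
) {
  // Describe each input as a string parameter of the add_data_point function.
  const properties: Record<string, { type: string }> = {};
  for (const name of inputNames) properties[name] = { type: "string" };

  return {
    model: "gpt-3.5-turbo-0613",
    messages: [
      {
        role: "system",
        content:
          "The user is testing multiple data points against a prompt. " +
          "Please generate a unique data point distinct from the existing ones.",
      },
      { role: "user", content: "User prompts:\n---\n" + prompts.join("\n") },
      // Replay each existing playground row as a prior add_data_point call so
      // the model avoids duplicating it. In the real API, `arguments` must be
      // a JSON-encoded string.
      ...existingRows.map((row) => ({
        role: "assistant",
        content: null as string | null,
        function_call: {
          name: "add_data_point",
          arguments: JSON.stringify(row),
        },
      })),
    ],
    functions: [
      { name: "add_data_point", parameters: { type: "object", properties } },
    ],
  };
}

// Parse the assistant's function_call.arguments (a JSON string) into a row
// that can be appended to the playground.
function parseNewRow(argumentsJson: string): Row {
  return JSON.parse(argumentsJson);
}
```

The new row is then appended to the playground state on the frontend; if the model returns malformed JSON, the parse step would throw and the button could simply be retried.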

Notes

suadsuljovic commented 1 year ago

@mmabrouk Hello, I had some personal stuff to attend to over the last two weeks, so I wasn't very active.

I started working on this today. Where do you want the API call to OpenAI to be: on the backend or the frontend? I will do a deep dive into the ChatGPT API docs. Hopefully I will figure out what I need to use by the end of the day.

mmabrouk commented 1 year ago

@suadsuljovic great to have you back :) I don't see an advantage to doing the calls from the backend. I think we can keep it on the frontend; the keys are saved there anyway for now. We can easily refactor it to the backend later if needed.
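Keeping the call on the frontend, the request itself is a plain `fetch`. A minimal sketch (the helper name is hypothetical, and it assumes the key is read from wherever the app already stores it client-side):

```typescript
// Hypothetical helper: assemble fetch options for a frontend OpenAI call.
// The API key is assumed to come from the app's existing client-side storage.
function buildOpenAIFetchOptions(apiKey: string, body: object) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

// Usage (network call shown for illustration only):
// const res = await fetch(
//   "https://api.openai.com/v1/chat/completions",
//   buildOpenAIFetchOptions(apiKey, requestBody)
// );
```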

mmabrouk commented 1 year ago

Hey @suadsuljovic any updates from your side?

suadsuljovic commented 1 year ago

Hello, sorry, I was mostly busy looking for a new job, so I forgot about this.

I will try to finish it by the end of the week.

If I don't, just assign this to someone else.

mmabrouk commented 1 year ago

Thanks @suadsuljovic !