awslabs / agent-evaluation

A generative AI-powered framework for testing virtual agents.
https://awslabs.github.io/agent-evaluation/
Apache License 2.0

Formalize test plan schema #21

Open sharonxiaohanli opened 3 months ago

sharonxiaohanli commented 3 months ago

Currently the test plan is defined as a YAML file. For example, the target field looks like this:

target:
  type: bedrock-agent
  bedrock_agent_id: <>
  bedrock_agent_alias_id: <>

Converting the current YAML to a JSON schema would give something like:

"target": {
  "type": "object",
  "properties": {
    "type": {"type": "string"},
    "bedrock_agent_id": {"type": "string"},
    "bedrock_agent_alias_id": {"type": "string"}
  },
  "required": ["type", "bedrock_agent_id", "bedrock_agent_alias_id"]
}

This schema does not work for other target types, which makes it hard to validate the test plan without actually running the test. I propose defining a JSON schema for the test plan file, like:

"target": {
  "type": "object",
  "properties": {
    "type": {
      "type": "string",
      "enum": ["bedrock-agent", "q-business"]
    },
    "attributes": {"type": "object"}
  },
  "required": ["type", "attributes"],
  "oneOf": [
    {
      "properties": {
        "type": {"const": "bedrock-agent"},
        "attributes": {
          "properties": {
            "bedrock_agent_id": {"type": "string"},
            "bedrock_agent_alias_id": {"type": "string"}
          },
          "required": ["bedrock_agent_id", "bedrock_agent_alias_id"]
        }
      }
    },
    {
      "properties": {
        "type": {"const": "q-business"},
        "attributes": {
          "properties": {
            "q_business_application_id": {"type": "string"},
            "q_business_user_id": {"type": "string"}
          },
          "required": ["q_business_application_id", "q_business_user_id"]
        }
      }
    }
  ]
}

so that:

  1. YAML and JSON test plans are interchangeable
  2. the format of the test plan file can be validated
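
A minimal sketch of the type-dependent check described above, written against plain dictionaries so it behaves the same whether the plan was loaded from YAML or JSON. The attribute names are taken from the proposed schema; `REQUIRED_ATTRIBUTES` and `validate_target` are hypothetical names, not part of agent-evaluation, and a real implementation would more likely delegate to a JSON Schema validator library.

```python
# Hypothetical sketch: which attributes each target type requires.
# The type and attribute names come from the proposal; the rest is assumed.
REQUIRED_ATTRIBUTES = {
    "bedrock-agent": ("bedrock_agent_id", "bedrock_agent_alias_id"),
    "q-business": ("q_business_application_id", "q_business_user_id"),
}


def validate_target(target):
    """Return a list of readable problems; an empty list means the target is valid."""
    if not isinstance(target, dict):
        return ["target: expected a mapping"]

    target_type = target.get("type")
    if not isinstance(target_type, str) or target_type not in REQUIRED_ATTRIBUTES:
        allowed = ", ".join(sorted(REQUIRED_ATTRIBUTES))
        return [f"target.type: expected one of [{allowed}], got {target_type!r}"]

    attributes = target.get("attributes")
    if not isinstance(attributes, dict):
        return ["target.attributes: expected a mapping"]

    # Report every missing or non-string attribute, not just the first one.
    errors = []
    for name in REQUIRED_ATTRIBUTES[target_type]:
        if not isinstance(attributes.get(name), str):
            errors.append(f"target.attributes.{name}: required string is missing")
    return errors
```

Returning a list instead of raising on the first problem lets the caller show every issue in one pass, which fits the goal of validating a plan without running it.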

Acceptance criteria:

  1. JSON format for the test plan file is in place
  2. when the runner is triggered, it validates the test plan format and throws clear errors if the plan is malformed
  3. unit tests are added
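
One possible shape for criteria 2 and 3, as a sketch only: a loader the runner could call before doing any work, plus unit tests for it. `MalformedTestPlanError`, `load_test_plan`, and the error wording are assumptions made for illustration and do not reflect the existing agent-evaluation code. Only the `target` section is checked, since it is the only part of the plan discussed here.

```python
import json
import unittest


class MalformedTestPlanError(ValueError):
    """Hypothetical exception the runner could raise for a malformed plan."""


def load_test_plan(text):
    """Parse a JSON test plan and fail fast with a readable message."""
    try:
        plan = json.loads(text)
    except json.JSONDecodeError as exc:
        raise MalformedTestPlanError(
            f"test plan is not valid JSON (line {exc.lineno}, column {exc.colno}): {exc.msg}"
        ) from exc
    if not isinstance(plan, dict):
        raise MalformedTestPlanError("test plan: expected a mapping at the top level")
    target = plan.get("target")
    if not isinstance(target, dict):
        raise MalformedTestPlanError(
            "test plan: required section 'target' is missing or not a mapping"
        )
    for key in ("type", "attributes"):
        if key not in target:
            raise MalformedTestPlanError(f"target: missing required field '{key}'")
    return plan


class LoadTestPlanTests(unittest.TestCase):
    def test_accepts_well_formed_plan(self):
        plan = load_test_plan('{"target": {"type": "q-business", "attributes": {}}}')
        self.assertEqual(plan["target"]["type"], "q-business")

    def test_reports_missing_field(self):
        with self.assertRaisesRegex(MalformedTestPlanError, "attributes"):
            load_test_plan('{"target": {"type": "q-business"}}')

    def test_reports_invalid_json(self):
        with self.assertRaisesRegex(MalformedTestPlanError, "not valid JSON"):
            load_test_plan("{not json")
```

The tests can be run with `python -m unittest` against the module that contains them.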

tonykchen commented 2 months ago

Hey @sharonxiaohanli, this is a great idea! Let's tackle this after the 0.1.0 release.