
A generative AI-powered framework for testing virtual agents.
https://awslabs.github.io/agent-evaluation/
Apache License 2.0


Agent Evaluation

Agent Evaluation is a generative AI-powered framework for testing virtual agents.

Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates the target's responses throughout the conversation.
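As a rough illustration of how the evaluator and target fit together, a test plan might look like the following. This is only a sketch; the target type, model identifier, and test fields shown here are assumptions, so check the documentation for the exact schema your version supports.

```yaml
# agenteval.yml (sketch; field names and values are illustrative assumptions)
evaluator:
  model: claude-3        # the LLM backing the evaluator agent
target:
  type: bedrock-agent    # the agent under test
  bedrock_agent_id: YOUR_AGENT_ID
  bedrock_agent_alias_id: YOUR_ALIAS_ID
tests:
  check_greeting:
    steps:
      - Greet the agent and ask what it can help with.
    expected_results:
      - The agent responds with a greeting and describes its capabilities.
```

The evaluator reads each test, drives a multi-turn conversation against the target, and judges whether the observed responses satisfy the expected results.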

✨ Key features

📚 Documentation

To get started, please visit the full documentation at https://awslabs.github.io/agent-evaluation/. To contribute, please refer to CONTRIBUTING.md.

👏 Contributors

Shout out to all of our awesome contributors.