zachblume / autospec

Autospec is an open-source AI agent that takes a web app URL, autonomously QAs it, and saves its passing specs as E2E test code.
https://autospec.dev
MIT License
47 stars, 4 forks

Benchmark spec execution separately from discovery - spec execution: #56

Open zachblume opened 4 months ago

zachblume commented 4 months ago

Adjust the benchmark to score spec execution with a string diff of the executed API calls against the expected ones, reported as the percentage of correct calls. (I assume we still want to award 0 points when the overall pass/fail boolean is wrong, though.)
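
A rough sketch of how that scoring could look, assuming each run is recorded as an ordered list of API-call strings plus an overall pass/fail boolean. The function name `score_execution`, its parameters, and the choice to divide by the longer of the two sequences are all hypothetical and not taken from the autospec codebase. Python's `difflib` stands in here for whatever diff we would actually use.

```python
from difflib import SequenceMatcher


def score_execution(expected_calls, actual_calls, expected_pass, actual_pass):
    """Return a score from 0 to 100 for one spec execution (hypothetical helper)."""
    # A wrong overall pass/fail answer earns nothing, whatever the calls were.
    if expected_pass != actual_pass:
        return 0.0

    # Nothing expected: full marks only if nothing was executed either.
    if not expected_calls:
        return 100.0 if not actual_calls else 0.0

    # Diff the two call sequences and count the calls that line up in order.
    matcher = SequenceMatcher(a=expected_calls, b=actual_calls, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())

    # Dividing by the longer sequence penalises both missing and extra calls.
    return 100.0 * matched / max(len(expected_calls), len(actual_calls))


if __name__ == "__main__":
    # Made-up call strings, for illustration only.
    expected = ["click('#login')", "fill('#email', 'a@b.c')", "click('#submit')"]
    actual = ["click('#login')", "fill('#email', 'a@b.c')", "click('#cancel')"]
    print(round(score_execution(expected, actual, True, True), 1))
    print(score_execution(expected, actual, True, False))
```

Whether extra calls should cost points, or only missing ones, is an open choice; swapping the denominator for `len(expected_calls)` would ignore extras.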