chore(autofix): Switch to claude for root cause step

getsentry / seer

Seer is a service that provides AI capabilities to Sentry, running inference on Sentry issues and providing insights to users.

https://sentry.io/lp/ai-ml-beta/


chore(autofix): Switch to claude for root cause step #1384

Closed roaga closed 1 week ago

roaga commented 1 week ago

Switching the model from GPT-4o to Claude 3.5 Haiku gives lower latency, better coding, and better root cause results.

(see row 2 below):

trillville commented 1 week ago

how much higher is the error rate? hard to tell from these metrics

roaga commented 1 week ago

> how much higher is the error rate? hard to tell from these metrics

Good runs: 98
Errored runs: 5
Error rate: 0.04
Errored in root cause: 0
Errored in plan: 3
Error rate in root cause: 0.00
Error rate in plan: 0.02
Error rate in something after plan: 0.02
Runs with unapplied changes: 24
Missing change rate: 0.19
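For anyone checking the arithmetic, here is a minimal sketch of how the rates above could be recomputed from the raw counts. It is not the actual eval script. The `EvalCounts` and `summarize` names are hypothetical, and the denominator (good + errored + unapplied = 127 runs) is an assumption, chosen only because it reproduces the reported two-decimal values. The "after plan" error count is likewise assumed to be the errored runs not attributed to root cause or plan.

```python
from dataclasses import dataclass


@dataclass
class EvalCounts:
    """Raw counts as reported in the comment above (names are hypothetical)."""
    good: int
    errored: int
    errored_root_cause: int
    errored_plan: int
    unapplied: int

    @property
    def total(self) -> int:
        # Assumption: every rate shares one denominator made of
        # good + errored + unapplied runs. This is inferred from the
        # reported numbers, not taken from the eval script.
        return self.good + self.errored + self.unapplied


def summarize(c: EvalCounts) -> dict[str, float]:
    """Recompute the reported rates, rounded to two decimals."""
    # Assumption: errors not in root cause or plan happened after plan.
    errored_after_plan = c.errored - c.errored_root_cause - c.errored_plan
    return {
        "error_rate": round(c.errored / c.total, 2),
        "root_cause_error_rate": round(c.errored_root_cause / c.total, 2),
        "plan_error_rate": round(c.errored_plan / c.total, 2),
        "after_plan_error_rate": round(errored_after_plan / c.total, 2),
        "missing_change_rate": round(c.unapplied / c.total, 2),
    }


counts = EvalCounts(good=98, errored=5, errored_root_cause=0,
                    errored_plan=3, unapplied=24)
rates = summarize(counts)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Under that assumed denominator, the recomputed values match the reported 0.04, 0.00, 0.02, 0.02, and 0.19.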

These are the correct numbers; the eval script is bugged @trillville

So the error rate is actually unaffected

roaga commented 1 week ago

actually going for haiku now instead of sonnet, evals are better