A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% of tasks (pass@1) in SWE-bench Lite and 38.40% of tasks (pass@1) in SWE-bench Verified, with each task costing less than $0.7.
Hello, I see you added new supported models. Can you provide an evaluation of them on SWE-bench so that it can be compared with the evaluations already done?
Thank you