All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License
33.12k stars, 3.79k forks

[Mega Issue] Agent Quality: Editing #3231

Open xingyaoww opened 2 months ago

xingyaoww commented 2 months ago

What problem or use case are you trying to solve?

OpenDevin hits a bunch of issues when trying to perform edits:

Describe the UX of the solution you'd like

Integrate / Build a light-weight evaluation for editing:

(Longer-term) Collect data and train a specialized model for editing

Do you have thoughts on the technical implementation?

Describe alternatives you've considered

Additional context

neubig commented 2 months ago

Here is an example of a prompt that causes lots of editing issues: https://www.all-hands.dev/share-opendevin?share_id=4e4841bd6240dc2ee334742dee59f5104b5c982ff9309aaeea5ce977968bfcf3

li-boxuan commented 2 months ago

Related issue: https://github.com/OpenDevin/OpenDevin/issues/3412

tobitege commented 2 months ago

Related issue: #3452

James4Ever0 commented 1 month ago

@neubig @tobitege @li-boxuan @xingyaoww

Editing can be implemented as Terminal or GUI agent operations. For example:

Cybergod is implementing this kind of agent.

If you want to find some legacy agentic editors, check code here and docs here.

neubig commented 1 month ago

Interesting, @James4Ever0! Have you compared the accuracy of something like this against other methods, such as the ones implemented in OpenHands or SWE-Agent?

James4Ever0 commented 1 month ago

> Interesting, @James4Ever0! Have you compared the accuracy of something like this against other methods, such as the ones implemented in OpenHands or SWE-Agent?

Benchmarking has not been done yet, since this is just a library for terminal interaction and the actual terminal agent implementation is the most important factor in performance. That is on my TODO list.

This method is promising since it allows direct interaction with almost all terminal programs, including those rendering complex graphics (like gameplay and QR codes) and layouts. So if there is anyone interested in implementing this terminal agent, I will do my best to answer all questions about my code and related topics.
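To make "direct interaction with terminal programs" concrete, here is a minimal sketch of the general idea using only Python's standard library on a POSIX system. It is not Cybergod's API: `run_in_terminal` is a hypothetical helper invented for illustration, and a real terminal agent would additionally need a screen emulator to interpret cursor movement and escape sequences.

```python
import os
import pty
import select
import subprocess
import sys


def run_in_terminal(argv, keystrokes, timeout=5.0):
    """Run argv attached to a pseudo-terminal, send keystrokes, return its output.

    Hypothetical helper for illustration only (assumes a POSIX system).
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
    )
    os.close(slave_fd)  # the child keeps its own copy of the slave end

    # "Type" into the program exactly as a user at a keyboard would.
    os.write(master_fd, keystrokes)

    chunks = []
    while True:
        ready, _, _ = select.select([master_fd], [], [], timeout)
        if not ready:
            break  # no output within the timeout
        try:
            data = os.read(master_fd, 4096)
        except OSError:
            break  # Linux raises EIO once the child has closed the terminal
        if not data:
            break  # other platforms signal end-of-file with an empty read
        chunks.append(data)

    if proc.poll() is None:
        proc.kill()  # do not hang on a program that is still waiting for input
    proc.wait()
    os.close(master_fd)
    return b"".join(chunks).decode(errors="replace")


# A tiny stand-in for an interactive terminal program.
CHILD = "import sys; line = sys.stdin.readline(); print('hello ' + line.strip())"

if __name__ == "__main__":
    print(run_in_terminal([sys.executable, "-c", CHILD], b"world\n"))
```

The returned text contains both the echoed keystrokes and the program's reply, which is the raw material an agent would parse to decide its next action. Driving a full-screen editor works the same way in principle, with the keystrokes replaced by the editor's own commands.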