plandex-ai / plandex

AI driven development in your terminal. Designed for large, real-world tasks.
https://plandex.ai
GNU Affero General Public License v3.0
10.54k stars, 730 forks

Plan much too long for repeating test fixes #178

Open: viraptor opened 1 month ago

viraptor commented 1 month ago

I tried to get Plandex to fix a few tests whose expected values needed updating. I was explicit about the task, using this prompt:

The values in the tests need to be updated to match the new results. Here are the failures: (8 cases of test failures including "expected ... got ..." from an rspec run)

Only one Ruby file was in context, about 200 lines long. (It's company-internal code, so I can't provide it, but a similar file is trivial to produce.)

Unfortunately, the task seems to have been split into far too many tiny subtasks. 50 GPT-4o requests later, Plandex was still working on the second failing test.

I'm hoping there's some prompt adjustment that could help with small, repetitive tasks like this.

danenania commented 1 month ago

Thanks for reporting this @viraptor—we're working on evals to improve results for these kinds of tasks.