block / goose

Goose is a developer agent that operates from your command line to help you do the boring stuff.
https://block.github.io/goose/
Apache License 2.0

feat: In safe-mode, ask the user whether they want to continue or cancel before any possible changes are made #122

Open lifeizhou-ap opened 2 weeks ago

lifeizhou-ap commented 2 weeks ago

Why At the moment, once a plan is kicked off, its tasks are executed automatically. Users have little opportunity to review the changes a task will make, so it is hard for them to decide whether to proceed, skip the task, or cancel the plan. The only way to intervene is CTRL-C.

What Enable a safe-mode (can be renamed) to provide some control for the user to decide whether to continue before running the tools of

Make the diff more readable

Shell Before a shell command is executed, the user is shown an explanation of the command. The AI also determines (via the prompt) whether the command will make any changes to the user's computer. If it will, the user is asked whether they want to continue. If yes, the command is executed. If no, the task is cancelled along with all subsequent tasks. (We could also provide a skip option so that a specific task can be skipped, and polish the experience.)
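For illustration only (this is not the code in the PR), the shell flow described above could be sketched as follows. The function name `run_shell_safely` and its parameters are hypothetical; in particular, `changes_system` stands in for the model's own judgement about whether the command modifies anything, which the sketch does not try to reproduce.

```python
import subprocess
from typing import Callable, Optional


def run_shell_safely(
    command: str,
    explanation: str,
    changes_system: bool,
    ask: Callable[[str], str] = input,
) -> Optional[subprocess.CompletedProcess]:
    """Run `command`, asking for confirmation first when it may change the system.

    Returns the completed process, or None when the user declines.
    `changes_system` is a hypothetical flag supplied by the caller (for example,
    derived from the model's answer); it is not computed here.
    """
    print(f"Command: {command}")
    print(f"Explanation: {explanation}")

    if changes_system:
        answer = ask("This command may change your system. Continue? [y/N] ")
        if answer.strip().lower() not in ("y", "yes"):
            # The caller is expected to cancel this task and all later ones.
            return None

    # Read-only commands, and confirmed commands, run as usual.
    return subprocess.run(command, shell=True, capture_output=True, text=True)
```

Passing `ask` as a parameter keeps the prompt swappable, so the same function could be driven by a terminal prompt or by a scripted answer in tests.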

patch_file and write_file Before a file is changed, the user is shown a diff of the changes, similar to git diff, and is asked whether they want to apply it. If yes, the file is changed. If no, the file is left unchanged and goose moves on to the next task. (We can add a "yes to all" option later on, so goose can remember the user's decision for the rest of the plan.)
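As a rough sketch of the file flow above (again, not the PR's actual implementation), a git-diff-style preview can be produced with Python's standard `difflib` module. The helper names `render_diff` and `write_file_safely` are hypothetical.

```python
import difflib
from pathlib import Path
from typing import Callable


def render_diff(path: str, old: str, new: str) -> str:
    """Return a unified diff between the old and new content of `path`."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


def write_file_safely(
    path: str,
    new_content: str,
    ask: Callable[[str], str] = input,
) -> bool:
    """Show the diff and write the file only if the user agrees.

    Returns True when the file was written, False when nothing changed
    or the user declined (the caller then moves on to the next task).
    """
    target = Path(path)
    old_content = target.read_text() if target.exists() else ""

    diff = render_diff(path, old_content, new_content)
    if not diff:
        return False  # identical content, nothing to confirm

    print(diff)
    answer = ask(f"Apply these changes to {path}? [y/N] ")
    if answer.strip().lower() not in ("y", "yes"):
        return False

    target.write_text(new_content)
    return True
```

A "yes to all" option could be layered on top by having the caller remember the first affirmative answer and pass an `ask` that always returns it.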

https://github.com/user-attachments/assets/ab51fa7b-5cc7-4689-adba-d869836fa7b7

https://github.com/user-attachments/assets/18fc4de2-50ac-4f85-8455-5a7d6d8a60f5

https://github.com/user-attachments/assets/31d2fda6-c8c5-4e18-8d65-4656a2193aa2

Note This is a draft PR and the changes are a bit unorganised at the moment. The main purpose is to get your feedback on whether safe-mode would be useful for our users. I can polish and refactor the code afterwards.

codefromthecrypt commented 2 weeks ago

related: it would be nice for toolkits to have a means to declare the system packages they depend on. In this way, we could test for the binary, like gh, vs having goose attempt to auto-install it.

lifeizhou-ap commented 2 weeks ago

related: it would be nice for toolkits to have a means to declare the system packages they depend on. In this way, we could test for the binary, like gh, vs having goose attempt to auto-install it.

Hey @codefromthecrypt Thanks for the feedback.

Just to confirm that I understand your comment correctly: Do you mean we can have something smart in toolkit that they can know the packages on my local machine. For example, if I already have packages on my local machine that can achieve the task, the plan won't install another tool to do the same thing. Is my understanding correct?

codefromthecrypt commented 2 weeks ago

Do you mean we can have something smart in toolkit that they can know the packages on my local machine.

yeah right now, you can run a command and it might try to apt-install a system package. If the toolkits expose any binaries, or "exit code zero" commands they need, goose could obviate attempts to load that toolkit. OTOH, this won't prevent normal prompts from also attempting the same thing in a toolkit.

Here's pseudocode of what it might look like

--- a/src/goose/toolkit/github.py
+++ b/src/goose/toolkit/github.py
@@ -9,3 +9,7 @@ class Github(Toolkit):
     def system(self) -> str:
         """Retrieve detailed configuration and procedural guidelines for GitHub operations"""
         return Message.load("prompts/github.jinja").text
+
+    def system_prereqs(self) -> tuple[str, ...]:
+        """List of commands that must complete successfully for this toolkit to operate"""
+        return ("gh", "gh status")
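
To make the idea concrete, here is a minimal sketch of how the entries returned by a `system_prereqs`-style method might be checked before loading a toolkit. It is an assumption about one possible design, not existing goose code: the function name `missing_prereqs` is hypothetical, and so is the convention that a single-word entry such as "gh" only needs to be on the PATH while a multi-word entry such as "gh status" must exit with code zero.

```python
import shlex
import shutil
import subprocess
from typing import Iterable, List


def missing_prereqs(prereqs: Iterable[str], timeout: float = 10.0) -> List[str]:
    """Return the prerequisite commands that are unavailable or fail.

    Assumed convention (not from goose itself):
    - a bare binary name passes if it is found on the PATH;
    - a command with arguments passes only if it exits with code zero.
    """
    missing: List[str] = []
    for command in prereqs:
        argv = shlex.split(command)
        if not argv or shutil.which(argv[0]) is None:
            missing.append(command)
            continue
        if len(argv) == 1:
            continue  # binary exists, nothing further to run

        try:
            result = subprocess.run(argv, capture_output=True, timeout=timeout)
        except (OSError, subprocess.TimeoutExpired):
            missing.append(command)
            continue
        if result.returncode != 0:
            missing.append(command)
    return missing
```

A loader could then call `missing_prereqs(toolkit.system_prereqs())` and skip the toolkit, or warn the user, when the result is non-empty, rather than letting goose try to install the package itself.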
lifeizhou-ap commented 2 weeks ago

Hi @lamchau,

Thanks for the review! The code is a bit unorganised right now. With this draft PR I would first like to get feedback on the feature itself: whether it would be useful and whether it is worth experimenting with. If we think it is useful, I will continue with this PR and polish the code.

lamchau commented 2 weeks ago

@lifeizhou-ap it's worth experimenting with! i currently do this with a .goosehints-dry-run which i need to swap so i can bench/compare results

michaelneale commented 1 week ago

I think some people would appreciate this feature, as it can be a surprise how "biased to action" goose is out of the box (I like it myself so likely wouldn't use this mode, but if it doesn't add a lot of complexity, it is an interesting feature to add in as an option)