microsoft / WindowsAgentArena

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
https://microsoft.github.io/WindowsAgentArena
MIT License

How to deploy Navi Agent on my current OS and test a custom task #27

Open Calvvnono opened 3 days ago

Calvvnono commented 3 days ago

Hello,

I would like to directly deploy the Navi Agent on my current operating system (without the need for virtual machines or cloud setups) and test a specific custom task. My goal is to observe the agent's performance in real-time on my local environment.

Could you please clarify if this is feasible, and if so, what steps should I follow to achieve this?

Any guidance on this process would be greatly appreciated.

Thank you!

francedot commented 3 days ago

This feature is on our roadmap, but unfortunately, I can’t provide a specific timeline. It will require some modifications to the current Docker setup. We welcome contributions, so if you feel up to the task, here's some guidance to help you get started:

  1. In the current architecture, the client and server processes are both hosted in the same Docker container: the VM runs inside the container and executes the server (VM Controller). You can refer to this diagram for the components involved. To run on your own OS instead, you will need to decouple the client from the server and run the server directly on your Windows host.
  2. To run the benchmark tasks, you'll first need to set up your environment with the necessary tools and programs. The setup.ps1 PowerShell script provides what’s required to do so. You might want to generalize the script so it works both on your host and within the current Docker setup.
  3. Next, you will need to run the Python server (VM Controller) component directly on your Windows host. To do this, create a new Conda environment on your Windows system and run pip install -r requirements from this requirements file.
  4. The client process will need to point to the Python server running on your localhost. You can find the references that need to be updated in the client here.
  5. The user folder, where programs are installed and recordings are stored, will need to be abstracted to support both Docker and the host setup. The relevant references to update are here.
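
As a rough illustration of steps 4 and 5, here is a minimal Python sketch of how the server address and the user folder could be read from one configuration point instead of being hard-coded. Everything named here is hypothetical and not taken from the WAA codebase: the environment variables `WAA_SERVER_HOST`, `WAA_SERVER_PORT`, and `WAA_USER_DIR`, the `ArenaConfig` class, and the placeholder default port.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class ArenaConfig:
    """Hypothetical settings shared by the Docker and host setups."""
    server_url: str   # where the client should reach the Python server
    user_dir: Path    # where programs are installed and recordings are stored


def load_config(env: Optional[Mapping[str, str]] = None) -> ArenaConfig:
    """Build the config from environment variables (all names are assumptions)."""
    env = os.environ if env is None else env
    host = env.get("WAA_SERVER_HOST", "localhost")
    # 5000 is only a placeholder default; use the port your server listens on.
    port = int(env.get("WAA_SERVER_PORT", "5000"))
    user_dir = Path(env.get("WAA_USER_DIR", str(Path.home())))
    return ArenaConfig(server_url=f"http://{host}:{port}", user_dir=user_dir)
```

With something like this in place, the Docker setup and a host setup would differ only in the values they export, and the client references mentioned in step 4 could read `load_config().server_url` rather than a fixed address.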

Calvvnono commented 2 days ago

Much appreciation for your guidance!

Another issue one of my colleagues ran into: while building the WAA golden image, the terminal got stuck displaying "starting arena..." and the 'localhost:8006' page showed nothing. He didn't use the official Microsoft ISO that the README specifies because it was too slow to download. Could this be the reason, or are there other possible causes?

francedot commented 2 days ago

Yes, I have seen this error before when the Windows 11 Enterprise Evaluation ISO (90-day trial) is not used. This is the only edition of Windows 11 that we support.

Calvvnono commented 2 days ago

Thank you again for your response, which helped us investigate the issue! After submitting the issue, we re-downloaded the specified image and went through the deployment process from the beginning, but we got stuck at the same point again. This is quite frustrating, as we have almost no way to pinpoint where the issue lies. We've checked everything we could think of, including KVM and resource customization. Could you suggest any other potential causes or share any insights?

francedot commented 2 days ago

Please follow the same troubleshooting steps as outlined in this issue.

Let me know if you have any questions.

Calvvnono commented 2 days ago

> You can troubleshoot any errors occurring during the preparation phase of the golden image (./run-local --prepare-image true) by looking at the logs under src/win-arena-container/vm/setup/ps_script_log.txt

Yes, we tried to pinpoint the issue through the logs, but the log file didn't exist. It seems that the preparation of the golden image never actually started, which is exactly what confuses us.

francedot commented 2 days ago

I have seen this happen only when the wrong ISO edition is provided as setup.iso. In this case, the automated preparation step (and corresponding logs) will never start through the unattended file here. Please make sure to pick the correct ISO edition here and start fresh by deleting the src/win-arena-container/vm/storage content before running the preparation step again.
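
For the "start fresh" step, a minimal Python sketch of emptying the storage folder might look like the following. The helper name `clear_directory` is hypothetical and not part of WAA. It deletes everything inside the given directory while keeping the directory itself, so double-check the path before running it.

```python
import shutil
from pathlib import Path


def clear_directory(directory: Path) -> int:
    """Delete every entry inside `directory`, keeping the directory itself.

    Returns the number of top-level entries removed (0 if the directory
    does not exist).
    """
    removed = 0
    if not directory.is_dir():
        return removed
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a subfolder and its contents
        else:
            entry.unlink()         # remove a file or a symlink
        removed += 1
    return removed
```

Run from the repository root, `clear_directory(Path("src/win-arena-container/vm/storage"))` would empty the storage folder. After re-running the preparation step, `Path("src/win-arena-container/vm/setup/ps_script_log.txt").exists()` is a quick way to check whether the automated setup actually started.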