Open dandansamax opened 3 months ago
A test result in the cross-environment setting:
========================================
Start agent step 0:
2024-09-10 17:35:12,157 DEBUG -- Environment.observe ran in 0.18s with name ubuntu
2024-09-10 17:35:31,640 DEBUG -- Environment.observe_with_prompt ran in 19.66s with name ubuntu
2024-09-10 17:35:32,244 DEBUG -- Environment.observe ran in 0.6s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 11.79s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Benchmark.observe_with_prompt ran in 31.45s with name ubuntu_android_benchmark
2024-09-10 17:35:54,046 DEBUG -- SingleAgentPolicy.chat ran in 9.61s
So agent take action: [ActionOutput(name='write_text', arguments={'text': 'restaurant around kaust'}, env='ubuntu')]
2024-09-10 17:35:55,911 DEBUG -- Benchmark.step ran in 1.86s with name ubuntu_android_benchmark
Action "write_text" in env "ubuntu" success. current evaluation results: {'total_nodes': 3, 'complete_nodes': 0, 'completeness': 0.0, 'completeness_per_action': 0.0, 'step_to_complete': 2, 'longest_unfinished_path_length': 2}
The local deep learning model is the most time-consuming part.
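To make the breakdown explicit, here is a minimal sketch that parses the `DEBUG` lines from the log above and sums the time per step. The regular expression and the helper names (`parse_timings`, `TOP_LEVEL`) are my own and not part of the project. It also assumes that the `Environment.*` timings are nested inside `Benchmark.observe_with_prompt` (the numbers are consistent with that: 19.66 + 11.79 + 0.0 = 31.45), so only the three outermost calls are added up.

```python
import re

# Log lines copied from the run above.
LOG = """\
2024-09-10 17:35:12,157 DEBUG -- Environment.observe ran in 0.18s with name ubuntu
2024-09-10 17:35:31,640 DEBUG -- Environment.observe_with_prompt ran in 19.66s with name ubuntu
2024-09-10 17:35:32,244 DEBUG -- Environment.observe ran in 0.6s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 11.79s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Benchmark.observe_with_prompt ran in 31.45s with name ubuntu_android_benchmark
2024-09-10 17:35:54,046 DEBUG -- SingleAgentPolicy.chat ran in 9.61s
2024-09-10 17:35:55,911 DEBUG -- Benchmark.step ran in 1.86s with name ubuntu_android_benchmark
"""

# Hypothetical pattern matching the timing lines; the "with name" part is optional.
PATTERN = re.compile(
    r"DEBUG -- (?P<func>[\w.]+) ran in (?P<secs>[\d.]+)s(?: with name (?P<name>\w+))?"
)

# Assumption: these three calls are the outermost ones in a step.
TOP_LEVEL = {"Benchmark.observe_with_prompt", "SingleAgentPolicy.chat", "Benchmark.step"}


def parse_timings(log):
    """Return a list of (function, name or None, seconds) tuples."""
    rows = []
    for line in log.splitlines():
        m = PATTERN.search(line)
        if m:
            rows.append((m["func"], m["name"], float(m["secs"])))
    return rows


timings = parse_timings(LOG)
top_level = {func: secs for func, _, secs in timings if func in TOP_LEVEL}
total = round(sum(top_level.values()), 2)

for func, secs in top_level.items():
    print(f"{func:32s} {secs:6.2f}s  ({secs / total:.0%})")
print(f"{'total':32s} {total:6.2f}s")
```

Under that nesting assumption, observation accounts for roughly three quarters of the step, the policy call for about a fifth, and executing the action for the small remainder.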
Current agents take at least 20 seconds per step, and even longer when multiple environments are involved. We aim to bring the step time below 10 seconds for the two-environment setting.
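One possible direction, sketched here only as an illustration: the log shows the `ubuntu` and `android` observations running one after another (19.66s + 11.79s), so running them concurrently could bring the observation phase closer to the slower of the two. The function `observe_with_prompt` below is a hypothetical stand-in that just sleeps, not the project's real method, and the delays are scaled down. Whether this helps in practice depends on whether the local model can serve two requests at once, which I have not verified.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def observe_with_prompt(env_name, delay):
    """Hypothetical stand-in for a per-environment observation call."""
    time.sleep(delay)  # simulates screenshot capture plus local model inference
    return f"{env_name} observation"


def observe_all(envs):
    """Observe every environment concurrently, one worker thread each."""
    with ThreadPoolExecutor(max_workers=len(envs)) as pool:
        futures = {
            name: pool.submit(observe_with_prompt, name, delay)
            for name, delay in envs.items()
        }
        # result() blocks until the corresponding observation is done.
        return {name: future.result() for name, future in futures.items()}


# Scaled-down delays standing in for the two environments.
envs = {"ubuntu": 0.2, "android": 0.1}

start = time.perf_counter()
results = observe_all(envs)
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")  # about the longer delay rather than the sum
```

Threads are used here because the simulated work is blocking I/O. If the observation is bound by a single GPU, batching both screenshots into one model call may be a better fit than threading.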