camel-ai / crab

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/

[Roadmap] Optimize agent stepping efficiency #15

Open dandansamax opened 3 months ago

dandansamax commented 3 months ago

Current agents take at least 20 seconds per step, and even longer in multi-environment settings. We aim to reduce the step time to under 10 seconds in the two-environment setting.

dandansamax commented 2 months ago

A test result in a cross-environment setting:

========================================
Start agent step 0:
2024-09-10 17:35:12,157 DEBUG -- Environment.observe ran in 0.18s with name ubuntu
2024-09-10 17:35:31,640 DEBUG -- Environment.observe_with_prompt ran in 19.66s with name ubuntu
2024-09-10 17:35:32,244 DEBUG -- Environment.observe ran in 0.6s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 11.79s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Benchmark.observe_with_prompt ran in 31.45s with name ubuntu_android_benchmark
2024-09-10 17:35:54,046 DEBUG -- SingleAgentPolicy.chat ran in 9.61s
So agent take action: [ActionOutput(name='write_text', arguments={'text': 'restaurant around kaust'}, env='ubuntu')]
2024-09-10 17:35:55,911 DEBUG -- Benchmark.step ran in 1.86s with name ubuntu_android_benchmark
Action "write_text" in env "ubuntu" success. current evaluation results: {'total_nodes': 3, 'complete_nodes': 0, 'completeness': 0.0, 'completeness_per_action': 0.0, 'step_to_complete': 2, 'longest_unfinished_path_length': 2}

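The local deep learning model is the most time-consuming part. In the log above, Benchmark.observe_with_prompt accounts for 31.45s of the step, versus 9.61s for SingleAgentPolicy.chat and 1.86s for Benchmark.step.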
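
To make this kind of breakdown repeatable, the timing lines can be ranked automatically. The following is a minimal sketch that assumes the debug log keeps the "X ran in Ys with name N" format shown above; `summarize_timings`, `TIMING_RE`, and `SAMPLE_LOG` are hypothetical names for illustration and are not part of crab.

```python
import re

# Assumed line format, copied from the log above, e.g.
# "2024-09-10 17:35:12,157 DEBUG -- Environment.observe ran in 0.18s with name ubuntu"
# The "with name ..." suffix is optional (SingleAgentPolicy.chat has none).
TIMING_RE = re.compile(
    r"DEBUG -- (?P<func>[\w.]+) ran in (?P<secs>\d+(?:\.\d+)?)s"
    r"(?: with name (?P<name>\S+))?"
)


def summarize_timings(log_text):
    """Return (function, name, seconds) tuples, slowest first."""
    rows = []
    for line in log_text.splitlines():
        match = TIMING_RE.search(line)
        if match:
            rows.append(
                (match.group("func"), match.group("name"), float(match.group("secs")))
            )
    # sorted() is stable, so entries with equal durations keep log order.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Timing lines taken verbatim from the test result in this thread.
SAMPLE_LOG = """\
2024-09-10 17:35:12,157 DEBUG -- Environment.observe ran in 0.18s with name ubuntu
2024-09-10 17:35:31,640 DEBUG -- Environment.observe_with_prompt ran in 19.66s with name ubuntu
2024-09-10 17:35:32,244 DEBUG -- Environment.observe ran in 0.6s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 11.79s with name android
2024-09-10 17:35:43,427 DEBUG -- Environment.observe ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Environment.observe_with_prompt ran in 0.0s with name root
2024-09-10 17:35:43,427 DEBUG -- Benchmark.observe_with_prompt ran in 31.45s with name ubuntu_android_benchmark
2024-09-10 17:35:54,046 DEBUG -- SingleAgentPolicy.chat ran in 9.61s
2024-09-10 17:35:55,911 DEBUG -- Benchmark.step ran in 1.86s with name ubuntu_android_benchmark
"""

if __name__ == "__main__":
    for func, name, secs in summarize_timings(SAMPLE_LOG):
        print(f"{secs:6.2f}s  {func}  ({name or 'no name'})")
```

Note that the entries are nested rather than independent: the 31.45s for Benchmark.observe_with_prompt is the sum of the per-environment Environment.observe_with_prompt calls (19.66s + 11.79s + 0.0s), so the durations should be compared, not added up.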