DigiRL-agent / digirl

Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.
Apache License 2.0

[Question] How UI trajectories were created from AITW dataset #20

Closed Mohamed209 closed 1 month ago

Mohamed209 commented 2 months ago

As I understand it, you processed the AITW dataset categories into **-sft-trajectories.pt archives that are used when training the agent. However, the processing scripts do not seem to be in the repo, so it is not clear how these archives (for example general-off2on-sft-trajectories.pt) are created from the original AITW data. Second question: why split the agents per category, e.g. [general, web]? My view is that it would be better to have one powerful agent trained on both tasks; for example, the Auto-UI VLM was trained on all categories together, not as a separate model for each category.

BiEchi commented 2 months ago

Thanks for your interest in our work.

  1. These are trajectories that you collect with only the raw AutoUI checkpoint, not trajectories collected while training the agent.
  2. This is to gain insight into how the agent works on different task sets; you can also try training on them jointly. To do this, simply merge the task text files.

Mohamed209 commented 2 months ago

Sorry, answer 1 is still not clear. In the attached method, the offline data path is, for example, general-offline-sft-trajectories.pt. If I want to train the agent jointly on all tasks, I assume I need to pass something like all_tasks-offline-sft-trajectories.pt, but I cannot find the scripts that create such archives, either here or in the Auto-UI repo, so I cannot modify them to generate a .pt file with trajectories from all tasks. Would you mind sharing more details on how these trajectories were created and on their naming conventions? https://drive.google.com/drive/folders/1ud1XyzCfh0257CixxdgLjjpX59jYbhfU Thanks in advance.

Screenshot from 2024-10-01 15-34-58

BiEchi commented 1 month ago

Thanks for following up. To collect trajectories using AutoUI, use the eval_only.yaml config and create a new txt file by concatenating webshop_train.txt and general_train.txt. Then specify the path to this new task set in the config file so that the environment samples tasks from this new file.
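
The concatenation step could be sketched roughly as below. This is a minimal illustration and not a script from the repo: the helper name merge_task_files and the output name all_train.txt are hypothetical, and it assumes each task file is plain text with one task per line.

```python
from pathlib import Path


def merge_task_files(sources, destination):
    """Concatenate task files into one file and return the merged task list.

    Assumption (not confirmed by the repo): each file holds one task per line.
    Blank lines are skipped, so a missing trailing newline in one file cannot
    glue two tasks together.
    """
    merged = []
    for source in sources:
        for line in Path(source).read_text(encoding="utf-8").splitlines():
            task = line.strip()
            if task:
                merged.append(task)
    # Write one task per line, ending the file with a newline.
    Path(destination).write_text(
        "".join(task + "\n" for task in merged), encoding="utf-8"
    )
    return merged
```

Calling merge_task_files(["webshop_train.txt", "general_train.txt"], "all_train.txt") from the directory that holds the task files would produce the combined file; the task-set path in eval_only.yaml would then need to point at it.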

BiEchi commented 1 month ago

Closing due to inactivity.