Goal: understand the schema and format of the control datasets in order to craft effective prompts for zero-shot and fine-tuned evaluation of closed-source VLMs on control tasks.
Meta-World action space, quoted from https://arxiv.org/pdf/1910.10897: "The action space is a 2-tuple consisting of the change in 3D space of the end-effector followed by a normalized torque that the gripper fingers should apply. The actions in this space range between −1 and 1."
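To make the quoted action space concrete for prompting, here is a minimal sketch of how it could be described to a VLM and how a text reply could be turned back into an action. It assumes the 2-tuple flattens to four numbers (dx, dy, dz, gripper torque) and that the model is asked to answer with a bracketed list; the names `ACTION_DIM`, `describe_action_space`, and `parse_action` are hypothetical helpers and not part of any dataset or library API.

```python
import re
from typing import List

# Assumption: the 2-tuple (3D end-effector delta, gripper torque) is flattened
# into 4 numbers in the order dx, dy, dz, torque.
ACTION_DIM = 4
ACTION_LOW, ACTION_HIGH = -1.0, 1.0


def describe_action_space() -> str:
    """Return a prompt fragment describing the expected action format."""
    return (
        "Respond with one action as a list [dx, dy, dz, torque]. "
        "dx, dy, dz are the change in 3D position of the end-effector and "
        "torque is the normalized torque applied by the gripper fingers. "
        f"Every value must lie between {ACTION_LOW} and {ACTION_HIGH}."
    )


def parse_action(reply: str) -> List[float]:
    """Extract the first bracketed list from a model reply and clip it to range."""
    # Normalize the Unicode minus sign so values such as "−0.2" parse correctly.
    normalized = reply.replace("\u2212", "-")
    match = re.search(r"\[([^\]]*)\]", normalized)
    if match is None:
        raise ValueError("no bracketed list found in reply")
    parts = [p.strip() for p in match.group(1).split(",")]
    if len(parts) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} values, got {len(parts)}")
    values = [float(p) for p in parts]  # raises ValueError on non-numeric entries
    # Clip each component into the documented [-1, 1] range.
    return [max(ACTION_LOW, min(ACTION_HIGH, v)) for v in values]
```

For example, `parse_action("Action: [0.1, -0.2, 0.3, 1.7]")` clips the out-of-range torque to the upper bound. Clipping rather than rejecting out-of-range values is a design choice made here for illustration; an evaluation might instead prefer to count such replies as format errors.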