How is OpenVLA evaluated in the original paper?
WidowX: 17 tasks across 5 categories:
- Visual generalization (5 tasks)
- Motion generalization (2 tasks)
- Physical generalization (3 tasks)
- Semantic generalization (4 tasks)
- Language grounding (3 tasks)
Google Robot: 12 tasks with 5 rollouts each (the last 7 tasks are out of distribution)
"We fine-tune OpenVLA on 10-100 demonstrations across 7 Franka Emika Panda tasks, ranging from single-instruction tasks to diverse multi-instruction tasks"
How can we adapt it for zero-shot?
Just need to modify `transform.py` and `dataset_statistics.json` to add the new robot's normalization and action-space mapping configs (see the sketch below).
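A minimal sketch of the statistics side, assuming a 7-DoF end-effector action space and OpenVLA's quantile-bounds (q01/q99) action normalization. The dataset key `my_robot` and the file `my_robot_actions.npy` are placeholders, and the exact JSON schema should be double-checked against the statistics files the repo actually ships:

```python
# Hypothetical sketch: generate a dataset_statistics.json entry for a new robot
# so OpenVLA can un-normalize its predicted actions at inference time.
import json

import numpy as np

# Actions from your own demonstrations: shape (N, 7) -> x, y, z, roll, pitch, yaw, gripper.
# "my_robot_actions.npy" is a placeholder for your data.
actions = np.load("my_robot_actions.npy")

stats = {
    "my_robot": {  # placeholder dataset key; must match the key used in transform.py
        "action": {
            # Quantile-bounds normalization clips to the 1st/99th percentiles,
            # then rescales to [-1, 1]; mean/std included for completeness.
            "q01": np.quantile(actions, 0.01, axis=0).tolist(),
            "q99": np.quantile(actions, 0.99, axis=0).tolist(),
            "mean": actions.mean(axis=0).tolist(),
            "std": actions.std(axis=0).tolist(),
        },
        "num_transitions": int(actions.shape[0]),
    }
}

with open("dataset_statistics.json", "w") as f:
    json.dump(stats, f, indent=2)
```

The matching `transform.py` change would be a per-dataset transform registered under the same key, mapping the new robot's raw observation/action format into this 7-D ordering.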
What are the datasets the original work covers in its evals?
All evaluations were run on real robots, except that the fine-tuned 7B OpenVLA LIBERO model was evaluated in simulation on the LIBERO benchmark (MuJoCo).
What are the model specs during evaluation? Full-scale or quantized?
Full-scale
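For context, a minimal loading sketch contrasting the two options, based on the Hugging Face usage documented for the released checkpoint. The 8-bit path uses standard `transformers` + `bitsandbytes` machinery and shows how one *could* quantize; it is not how the paper's reported numbers were produced:

```python
# Sketch: full-scale (bf16) vs. quantized (8-bit) loading of the public checkpoint.
import torch
from transformers import AutoModelForVision2Seq, AutoProcessor, BitsAndBytesConfig

MODEL_ID = "openvla/openvla-7b"

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

# Full-scale load (bfloat16), i.e. no quantization.
vla = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda:0")

# Optional quantized load (8-bit) to cut GPU memory; accuracy may differ
# from the full-scale results.
vla_8bit = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    low_cpu_mem_usage=True,
    trust_remote_code=True,
)
```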