ManifoldRG / MultiNet

Apache License 2.0

Scope out eval specifications #173

Closed pranavguru closed 1 month ago

pranavguru commented 1 month ago

Answer the following questions:

Locke0 commented 1 month ago

How is OpenVLA evaluated in the original paper?

  1. The full-size 7B OpenVLA model was evaluated on the WidowX robot and the Google Robot.

     WidowX: 17 tasks across 5 categories:

     1. Visual generalization (5 tasks)
     2. Motion generalization (2 tasks)
     3. Physical generalization (3 tasks)
     4. Semantic generalization (4 tasks)
     5. Language grounding (3 tasks)

     Google Robot: 12 tasks with 5 rollouts each (the last 7 tasks are out of distribution).

  2. Fine-tuning experiments

     "We fine-tune OpenVLA on 10-100 demonstrations across 7 Franka Emika Panda tasks, ranging from single-instruction tasks to diverse multi-instruction tasks"

  3. The fine-tuned OpenVLA 7B LIBERO model was evaluated specifically on the LIBERO benchmark.

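As a minimal sketch of how the rollout protocol above aggregates into reported numbers (12 tasks × 5 rollouts, with the last 7 tasks out of distribution, per the counts above; the success/failure outcomes here are random placeholders, not real results):

```python
import numpy as np

# Hypothetical rollout outcomes: 12 Google Robot tasks x 5 rollouts each,
# 1 = success, 0 = failure (placeholder data for illustration only).
rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=(12, 5))

per_task = outcomes.mean(axis=1)   # success rate per task
in_dist = per_task[:5].mean()      # first 5 in-distribution tasks
out_dist = per_task[5:].mean()     # last 7 out-of-distribution tasks
overall = outcomes.mean()          # overall rate across all 60 rollouts
```

Reporting in-distribution and out-of-distribution rates separately makes the generalization gap visible rather than buried in a single average.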
Locke0 commented 1 month ago

How can we adapt it for zero-shot?

We just need to modify `transform.py` and `dataset_statistics.json` to add normalization and action-space mapping configs for the new datasets.
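A minimal sketch of what building such a normalization entry could look like. OpenVLA normalizes actions using per-dimension statistics (including 1st/99th-percentile bounds) stored per dataset; the exact key names and the dataset name `my_new_dataset` below are assumptions for illustration, not the verified schema:

```python
import json
import numpy as np

# Hypothetical sample of raw 7-DoF actions from the new dataset.
actions = np.random.default_rng(0).uniform(-0.5, 0.5, size=(1000, 7))

# Per-dimension statistics in the style of a dataset_statistics.json entry.
stats = {
    "action": {
        "mean": actions.mean(axis=0).tolist(),
        "std": actions.std(axis=0).tolist(),
        "q01": np.quantile(actions, 0.01, axis=0).tolist(),
        "q99": np.quantile(actions, 0.99, axis=0).tolist(),
    }
}

def normalize(action, stats):
    """Map a raw action to [-1, 1] using the 1st/99th percentile bounds,
    mirroring bounds-based action normalization (clipped at the edges)."""
    low = np.asarray(stats["action"]["q01"])
    high = np.asarray(stats["action"]["q99"])
    return np.clip(2.0 * (action - low) / (high - low) - 1.0, -1.0, 1.0)

# Entry that would be merged into dataset_statistics.json under the new key.
entry = {"my_new_dataset": stats}
serialized = json.dumps(entry)

norm = normalize(actions[0], stats)
```

The corresponding `transform.py` change would then map the model's normalized outputs back through the same bounds into the new robot's action space.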

What are the datasets the original work covers in its evals?

All evaluations were run on real robots, except that the fine-tuned 7B OpenVLA LIBERO model was evaluated on the LIBERO benchmark in MuJoCo simulation.

What are the model specs during evaluation? Full-scale or quantized?

Full-scale