facebookresearch / agenthive

AgentHive provides the primitives and helpers for seamless use of robohive within TorchRL.
Clarification on evaluation metric #18

Open gunshi opened 9 months ago

gunshi commented 9 months ago

Hi, thanks for open-sourcing this framework! I'm trying to reproduce the baseline results reported in the RoboHive paper, and wanted to ask exactly which metric is averaged over 3 seeds in the Franka-expert data runs (here: https://github.com/facebookresearch/agenthive/tree/dev/scripts). Is it the maximum success rate over a run, averaged over 3 seeds; the maximum of the success rate averaged over 3 seeds; or something else? The paper doesn't seem to say how the success rate of a run is determined across its many checkpoints. Thanks!
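
To make the two candidates concrete, here is a minimal sketch of how they differ. The numbers and the names `success_by_seed`, `mean_of_max`, and `max_of_mean` are made up for illustration and do not come from the paper or the repository.

```python
# Hypothetical success rates: one list per seed, one entry per checkpoint.
success_by_seed = [
    [0.2, 0.9, 0.5],  # seed 0
    [0.8, 0.3, 0.6],  # seed 1
    [0.4, 0.5, 0.7],  # seed 2
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def mean_of_max(runs):
    """Take each seed's best checkpoint, then average over seeds."""
    return mean(max(run) for run in runs)


def max_of_mean(runs):
    """Average over seeds at each checkpoint, then take the best checkpoint."""
    return max(mean(column) for column in zip(*runs))


print("max per run, averaged over seeds:", mean_of_max(success_by_seed))
print("max of the seed-averaged curve:  ", max_of_mean(success_by_seed))
```

The first value can never be lower than the second, since each seed is allowed to pick its own best checkpoint, so the choice of metric matters when comparing against reported numbers.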

ShahRutav commented 8 months ago

We report the success rate averaged over three seeds × three camera angles (except for the Robel Suite, where we use all the camera angles). The success rate is measured at the last checkpoint of each run.
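
As a rough illustration of the metric described above, the following sketch averages the last-checkpoint success rate over every (seed, camera angle) pair. The `results` layout, the camera names, and the numbers are assumptions made for the example, not the actual AgentHive logging format or real results.

```python
# Hypothetical layout: results[seed][camera] is the list of success rates
# logged at each checkpoint of that run (oldest first).
results = {
    0: {"left": [0.1, 0.6], "right": [0.2, 0.7], "top": [0.3, 0.8]},
    1: {"left": [0.2, 0.5], "right": [0.1, 0.6], "top": [0.4, 0.7]},
    2: {"left": [0.3, 0.7], "right": [0.5, 0.8], "top": [0.6, 0.9]},
}


def last_checkpoint_success(results):
    """Average the final-checkpoint success rate over all seed/camera pairs."""
    finals = [
        rates[-1]  # last checkpoint only; earlier checkpoints are ignored
        for cameras in results.values()
        for rates in cameras.values()
    ]
    return sum(finals) / len(finals)


print("runs averaged:", sum(len(c) for c in results.values()))
print("reported-style success rate:", last_checkpoint_success(results))
```

Because only the final checkpoint counts, no maximum over checkpoints is taken; a run that peaked earlier and then degraded would be scored at its final value.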