Hello! When running
python llm_rl_scripts/maze/bc/eval_bc.py PARAMS bc_checkpoint_path
it outputs a JSON file with statistics for every starting point. Is there a script that computes the normalized score or aggregates these statistics the way they are reported in the paper? The same question applies to the other environments. Or are we supposed to extract the scores manually?
Hi! Thank you for your question. There is no aggregation script; we extracted the scores manually. The equations are in Appendix E of the paper, along with the values used to normalize the scores.
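In case it helps anyone doing the manual extraction, here is a minimal sketch of how the aggregation could be scripted. It is not part of the repository. The JSON layout (a mapping from starting point to a dict of statistics), the key name `avg_reward`, and the linear rescaling in `normalize` are all assumptions for illustration; check the actual output file and replace the formula and constants with the ones from Appendix E.

```python
import json
from statistics import mean


def load_results(path):
    """Load the JSON file written by the evaluation script."""
    with open(path) as f:
        return json.load(f)


def aggregate_raw_score(results, key="avg_reward"):
    # ASSUMPTION: `results` maps each starting point to a dict of statistics
    # and `key` names the per-start return. Adjust to the real file layout.
    return mean(stats[key] for stats in results.values())


def normalize(raw, low, high):
    # ASSUMPTION: plain linear rescaling to [0, 100]. Substitute the
    # equation and reference values given in Appendix E.
    return 100.0 * (raw - low) / (high - low)


# Made-up numbers, only to show the flow; real values come from the JSON file.
example = {
    "(1, 1)": {"avg_reward": -10.0},
    "(3, 2)": {"avg_reward": -30.0},
}
raw = aggregate_raw_score(example)
score = normalize(raw, low=-100.0, high=0.0)
print(raw, score)
```

For a real run, replace `example` with `load_results("path/to/output.json")` and pass the environment's reference values as `low` and `high`.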