Closed edbeeching closed 1 year ago
I believe the problem is with how we get the path to train/enjoy scripts in HF integration code. Specifically:
if cfg.push_to_hub:
enjoy_name = get_top_level_script()
generate_model_card(
experiment_dir(cfg=cfg), cfg.algo, cfg.env, cfg.hf_repository, reward_list, enjoy_name, cfg.train_script
)
push_to_hf(experiment_dir(cfg=cfg), cfg.hf_repository)
def get_top_level_script():
return argv[0].split("sample-factory/")[-1][:-3].replace("/", ".")
I am not 100% sure what get_top_level_script
function does but it obviously breaks in colab. We might need some more reliable mechanism for this. I was hesitant about it during the code review, but it seemed to work locally.
I believe @andrewzhang505 might know more about this!
Yeah, I think the get_top_level_script
is the issue. It is supposed to find the file path of the script being run and split based on the string "sample-factory/" to find the module to run.
I did do some research afterwards to look for alternatives but couldn't find any that seemed to work. If we need a quick fix for the course, I could replace the calls of get_top_level_script to some extra config parameters. Something like:
--enjoy_script=sf_examples.mujoco.enjoy_mujoco --train_script=sf_examples.mujoco.train_mujoco
And we can skip those sections in the model card if those command line args aren't there
@andrewzhang505 thank you for suggestions!
Or we can use placeholder <path.to.train.module>
in the report card if the values are not provided. Very simple solution, getting the actual paths to entry point scripts looks like a headache
I think current implementation is also ought to be replaced just because it relies on the assumption that module path contains sample-factory
substring which might just not be the case. I.e. train script might be in a completely different repo, or SF repo folder can be named differently.
Made a PR for changing the get_top_level_script
to new command line params here: https://github.com/alex-petrenko/sample-factory/pull/256
I think we can close this?
I will try and look at this ASAP, example below.