alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License
773 stars 106 forks source link

HF hub model card shows incorrect text when used in colab #255

Closed edbeeching closed 1 year ago

edbeeching commented 1 year ago

I will try and look at this ASAP, example below.

image

alex-petrenko commented 1 year ago

I believe the problem is with how we get the path to train/enjoy scripts in HF integration code. Specifically:

    if cfg.push_to_hub:
        enjoy_name = get_top_level_script()
        generate_model_card(
            experiment_dir(cfg=cfg), cfg.algo, cfg.env, cfg.hf_repository, reward_list, enjoy_name, cfg.train_script
        )
        push_to_hf(experiment_dir(cfg=cfg), cfg.hf_repository)
def get_top_level_script():
    return argv[0].split("sample-factory/")[-1][:-3].replace("/", ".")

I am not 100% sure what get_top_level_script function does but it obviously breaks in colab. We might need some more reliable mechanism for this. I was hesitant about it during the code review, but it seemed to work locally.

I believe @andrewzhang505 might know more about this!

andrewzhang505 commented 1 year ago

Yeah, I think the get_top_level_script is the issue. It is supposed to find the file path of the script being run and split based on the string "sample-factory/" to find the module to run.

I did do some research afterwards to look for alternatives but couldn't find any that seemed to work. If we need a quick fix for the course, I could replace the calls of get_top_level_script to some extra config parameters. Something like:

--enjoy_script=sf_examples.mujoco.enjoy_mujoco --train_script=sf_examples.mujoco.train_mujoco

And we can skip those sections in the model card if those command line args aren't there

alex-petrenko commented 1 year ago

@andrewzhang505 thank you for suggestions! Or we can use placeholder <path.to.train.module> in the report card if the values are not provided. Very simple solution, getting the actual paths to entry point scripts looks like a headache

alex-petrenko commented 1 year ago

I think current implementation is also ought to be replaced just because it relies on the assumption that module path contains sample-factory substring which might just not be the case. I.e. train script might be in a completely different repo, or SF repo folder can be named differently.

andrewzhang505 commented 1 year ago

Made a PR for changing the get_top_level_script to new command line params here: https://github.com/alex-petrenko/sample-factory/pull/256

alex-petrenko commented 1 year ago

I think we can close this?