ucl-dark / pax

Scalable Opponent Shaping Experiments in JAX

Combine Evaluation Runners #110

Closed: newtonkwan closed this 2 years ago

newtonkwan commented 2 years ago

Summary

Combines evaluation into a single runner. To evaluate a model, specify the model path in the .yaml file and set runner: eval.
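For illustration, the relevant part of an experiment config could look roughly like this (the model_path key and the example path are hypothetical placeholders, not taken from this PR):

```yaml
# hypothetical sketch of the eval-related fields in the experiment .yaml
runner: eval
model_path: exp/checkpoints/agent1.pkl   # path to the saved model to evaluate
```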

Background

Objective

Changes

TODO

Upcoming PR

Example

Evaluating IPD:
python -m pax.experiment +experiment/ipd=earl_v_tabular ++wandb.log=True ++num_envs=1 ++num_devices=1 ++num_steps=100 ++num_inner_steps=100 ++total_timesteps=10000 ++wandb.name="testing_delete_me" ++runner=eval

Evaluating IMP:
python -m pax.experiment +experiment/mp=earl_v_tabular ++wandb.log=True ++num_envs=1 ++num_devices=1 ++num_steps=100 ++num_inner_steps=100 ++total_timesteps=10000 ++seed=0 ++wandb.name="testing_delete_me" ++runner=eval

Evaluating Coin Game:
python -m pax.experiment +experiment/cg=earl_v_ppo_memory ++wandb.log=True ++num_envs=100 ++num_devices=1 ++num_steps=16 ++total_timesteps=9600 ++seed=0 ++wandb.name="testing_delete_me" ++runner=eval

akbir commented 2 years ago

Sorry, just to be clear: eval now must only run with pre_trained agents?

I'm not sure we should tie that logic together. Surely there is a use case where I want a pre-trained agent to be used in another agent's training?

akbir commented 2 years ago

I think I'd propose moving the load logic into the pre_trained agents in experiment.py and then using the runners as you have set them.
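Something along these lines, as a rough sketch (every name below, e.g. load_params, AgentState, model_path, is hypothetical and not the existing pax API):

```python
from typing import Any, NamedTuple
import pickle


class AgentState(NamedTuple):
    params: Any


def load_params(model_path: str) -> Any:
    """Restore saved parameters from disk (checkpoint format assumed here)."""
    with open(model_path, "rb") as f:
        return pickle.load(f)


def maybe_restore(agent_str: str, state: AgentState, model_path: str) -> AgentState:
    """Swap in saved params when the config asks for a *_pretrained agent.

    Keeping this step in experiment.py means the rl, evo, and eval runners all
    receive a ready-to-use agent and none of them needs its own loading branch.
    """
    if agent_str.endswith("_pretrained"):
        return state._replace(params=load_params(model_path))
    return state
```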

Does the EvalRunner have the same logging as what you had in both previous eval_x runners?

akbir commented 2 years ago

I also think, just for housekeeping, we should keep all the runner method calls the same.

newtonkwan commented 2 years ago

Sorry, just to be clear: eval now must only run with pre_trained agents?

Yes. See the comments below for my take on adding back agent1: [agent1]_pretrained.

I'm not sure we should tie that logic together. Surely there is a use case where I want a pre-trained agent to be used in another agent's training?

Sure, I think we can add this back in if there is a potential use case for it. I didn't think we would need pre_trained agents anymore and should have asked first. Green light to separate them, i.e. adding agent1: [agent1]_pretrained back in and keeping runner: eval.

I think I'd propose moving the load logic into the pre_trained agents in experiment.py and then using the runners as you have set them.

Sure, adding the loading into experiment.py would be fine. It would require adding agent1: [agent1]_pretrained back in as an agent.

Does the EvalRunner have the same logging as what you had in both previous eval_x runners?

Yes. In the paper, I used what is essentially EvalRunner for Coin Game. I determined that EvalRunner also gives the same logging for IPD/IMP.

I also think, just for housekeeping, we should keep all the runner method calls the same.

This can be done, but it requires initializing all of the relevant variables before calling any runner, then passing them into every runner even if a given runner doesn't use them. For example, you'd need to initialize param_reshaper outside of the if runner == "evo" conditional and then pass it into both the rl and eval runners, even though neither of them uses it. Totally fine with it if that is the better way of doing it. The alternative, which I think is worse, is to initialize that setup inside the evo_runner.
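As a rough sketch of the first option (the runner classes below are simplified stand-ins rather than pax's actual runners, and the ParameterReshaper wiring is only an assumption about where it would be created):

```python
from evosax import ParameterReshaper


class RLRunner:
    def __init__(self, args, param_reshaper):
        # param_reshaper is accepted but unused, purely to keep the signature uniform
        self.args = args

    def run_loop(self, env, agents, num_steps, watchers):
        ...


class EvalRunner(RLRunner):
    pass


class EvoRunner:
    def __init__(self, args, param_reshaper):
        self.args = args
        self.param_reshaper = param_reshaper  # only the evo runner actually uses it

    def run_loop(self, env, agents, num_steps, watchers):
        ...


def setup_and_run(args, env, agents, watchers, dummy_params):
    # Initialised once, outside any `if args.runner == "evo"` branch, then passed
    # to every runner so they can all be constructed and called the same way.
    param_reshaper = ParameterReshaper(dummy_params)
    runner_cls = {"rl": RLRunner, "evo": EvoRunner, "eval": EvalRunner}[args.runner]
    runner = runner_cls(args, param_reshaper)
    # Identical entry point regardless of which runner was selected
    return runner.run_loop(env, agents, args.num_steps, watchers)
```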