PufferAI / PufferLib

Simplifying reinforcement learning for complex game environments
https://pufferai.github.io/
MIT License

'Rating' object is not subscriptable #58

Open trangml opened 10 months ago

trangml commented 10 months ago

I found an issue with the OpenSkillRating that causes Wandb logging to fail in the policy ranker.

The problem is here: https://github.com/PufferAI/PufferLib/blob/889f172cb27819f771681c91c9b51f8f1e132a17/pufferlib/policy_ranker.py#L90

Exception has occurred: TypeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
'Rating' object is not subscriptable
  File "/home/mtrang/anaconda3/envs/nmmo/lib/python3.9/site-packages/pufferlib/policy_ranker.py", line 90, in update_ranks
    f"skillrank/{wandb_policy}/mu": rating["mu"],
  File "/home/mtrang/Documents/rl/neural_mmo/baselines/reinforcement_learning/clean_pufferl.py", line 394, in evaluate
    self.policy_ranker.update_ranks(
  File "/home/mtrang/anaconda3/envs/nmmo/lib/python3.9/site-packages/pufferlib/utils.py", line 223, in wrapper
    result = func(*args, **kwargs)
  File "/home/mtrang/Documents/rl/neural_mmo/baselines/train_optuna.py", line 66, in objective
    _, stats, infos = trainer.evaluate()
  ...
TypeError: 'Rating' object is not subscriptable

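The simple fix of reading the fields as attributes (rating.mu, rating.sigma) instead of subscripting the Rating object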

    wandb.log({
        f"skillrank/{wandb_policy}/mu": rating.mu,
        f"skillrank/{wandb_policy}/sigma": rating.sigma,
        f"skillrank/{wandb_policy}/score": scores[wandb_policy],
        "agent_steps": step,
        "global_step": step,
    })

worked for me.
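
In case it helps anyone who sees ratings arrive as dicts in one setup and as objects in another, here is a minimal sketch of an accessor that tolerates both shapes. It is only an illustration: `Rating` below is a hypothetical stand-in dataclass rather than the real rating class, and `rating_fields` and `build_log_entry` are names I made up, not PufferLib or Wandb APIs. The resulting dict has the same keys as the snippet above and could be handed to `wandb.log`.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass
class Rating:
    """Hypothetical stand-in for the rating object (assumption, not the real class)."""
    mu: float
    sigma: float


def rating_fields(rating):
    """Return (mu, sigma) from either a mapping or an attribute-style object."""
    if isinstance(rating, Mapping):
        # Dict-style rating: rating["mu"] works.
        return rating["mu"], rating["sigma"]
    # Object-style rating: subscripting would raise TypeError, so use attributes.
    return rating.mu, rating.sigma


def build_log_entry(policy_name, rating, score, step):
    """Assemble a metrics dict with the same keys as the logging call in this issue."""
    mu, sigma = rating_fields(rating)
    return {
        f"skillrank/{policy_name}/mu": mu,
        f"skillrank/{policy_name}/sigma": sigma,
        f"skillrank/{policy_name}/score": score,
        "agent_steps": step,
        "global_step": step,
    }


rating = Rating(mu=25.0, sigma=8.333)

# Reproduce the failure from the traceback with the stand-in object.
try:
    rating["mu"]
except TypeError as err:
    print(f"subscript access fails: {err}")

# Attribute access succeeds and produces the entry to log.
entry = build_log_entry("policy_a", rating, score=1.0, step=100)
print(entry)
```

If the rating type is known to always be an object, the plain `rating.mu` / `rating.sigma` change is simpler and the helper is unnecessary.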