alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License
773 stars, 107 forks

small bug in returning multi-agent enjoy #242

Closed by andrewzhang505 1 year ago

andrewzhang505 commented 1 year ago

IsaacGym AllegroHand throws an error when returning the mean rewards in enjoy:

File "/home/andrewzhang/sample-factory/sample_factory/enjoy.py", line 269, in enjoy
    return ExperimentStatus.SUCCESS, float(np.mean(episode_rewards))
TypeError: only size-1 arrays can be converted to Python scalars

This seems to happen when some agents complete a different number of episodes than others. I changed the return value so that it is computed the same way as the value that gets logged.
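For illustration, a minimal sketch of one way to reduce per-agent episode rewards to a single scalar when the agents have finished different numbers of episodes. This is not the actual patch: the helper name `mean_episode_reward` is hypothetical, and it assumes `episode_rewards` is a list with one collection of finished-episode rewards per agent. It averages each agent separately and then averages those per-agent means, so the ragged lengths never have to be combined into one array.

```python
from collections import deque
from typing import Collection, Sequence


def mean_episode_reward(episode_rewards: Sequence[Collection[float]]) -> float:
    """Hypothetical helper (not part of sample-factory).

    Assumes episode_rewards[i] holds the finished-episode rewards of agent i.
    Returns the mean of the per-agent mean rewards as a plain Python float.
    """
    # Average each agent on its own; skip agents with no finished episodes.
    per_agent_means = [
        sum(float(r) for r in rewards) / len(rewards)
        for rewards in episode_rewards
        if len(rewards) > 0
    ]
    if not per_agent_means:
        # No agent finished an episode, so there is nothing to average.
        return float("nan")
    return sum(per_agent_means) / len(per_agent_means)


# Two agents that finished a different number of episodes.
ragged = [deque([1.0, 3.0], maxlen=100), deque([4.0], maxlen=100)]
print(mean_episode_reward(ragged))  # 3.0
```

Averaging per agent first weights every agent equally. Pooling all episodes into one list before averaging would weight every episode equally instead. Which of the two matches the logged value is not stated above.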