Fix small bug in the return value of multi-agent enjoy

IsaacGym AllegroHand throws an error when returning the mean rewards in enjoy:

File "/home/andrewzhang/sample-factory/sample_factory/enjoy.py", line 269, in enjoy
    return ExperimentStatus.SUCCESS, float(np.mean(episode_rewards))
TypeError: only size-1 arrays can be converted to Python scalars

This seems to happen when some agents complete a different number of episodes than others, so the per-agent reward lists have unequal lengths. I changed the return value to be computed the same way as the value that is logged.
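For illustration, here is a minimal sketch of one way to reduce ragged per-agent rewards to a single float. It is not the actual diff: it assumes `episode_rewards` is a list with one sequence of episode rewards per agent, and it assumes the logged value is the mean of per-agent means. The helper name `mean_episode_reward` is hypothetical, and the standard library's `fmean` is used only to keep the example self-contained.

```python
from collections import deque
from statistics import fmean


def mean_episode_reward(episode_rewards):
    """Hypothetical helper: average the per-agent mean rewards.

    Each agent's rewards are averaged separately, so agents that
    finished different numbers of episodes do not cause a shape problem.
    Agents with no finished episodes are skipped.
    """
    per_agent_means = [fmean(r) for r in episode_rewards if len(r) > 0]
    if not per_agent_means:
        # No agent finished an episode; there is nothing to average.
        return float("nan")
    return float(fmean(per_agent_means))


# Two agents that finished a different number of episodes (ragged input).
episode_rewards = [deque([1.0, 3.0], maxlen=100), deque([5.0], maxlen=100)]
result = mean_episode_reward(episode_rewards)
```

Averaging per agent first weights every agent equally regardless of how many episodes it finished. Pooling all episodes into one mean would be an alternative if each episode should count equally.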

alex-petrenko / sample-factory #242