Closed: adamcallaghan0 closed this issue 1 week ago.
Hi,
If I remember correctly, `global_step` is incremented by MOPPO.
Hi @adamcallaghan0,
I just ran some experiments with PGMORL and it seems you are right. I'll try to fix it along with the Gymnasium 1.0 migration.
Thank you! In the meantime I have implemented a minor fix on my end: I changed the warmup block in the `train` function so that `global_step` is incremented, and made a similar change slightly further down. This appears to be working correctly now.
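The exact snippets were not preserved above, so the following is only a minimal sketch of that kind of warmup change, assuming an iteration loop and attributes such as `steps_per_iteration` and `num_envs` (all names and constants here are illustrative, not the actual PGMORL code):

```python
# Hypothetical illustration only: names and constants are assumptions,
# not the actual PGMORL source.
steps_per_iteration = 2048  # env steps collected per iteration (assumed)
num_envs = 4                # number of parallel environments (assumed)
warmup_iterations = 80      # arbitrary value for the sketch

global_step = 0
for iteration in range(1, warmup_iterations + 1):
    # ... the agents would be trained here ...
    # The fix: advance the counter once per iteration so that wandb
    # logs more than a single global_step value.
    global_step += steps_per_iteration * num_envs

print(global_step)  # 80 * 2048 * 4 = 655360
```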
I have a concern with this right now: `self.__train_all_agents` trains all the agents in the population (usually 6). So I think we should increment by `self.steps_per_iteration * self.num_envs * len(self.agents)`.
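As a rough, hypothetical illustration of that counter update (the constants and variable names below are assumptions, not values from the repo):

```python
# Minimal sketch of the proposed update: one increment per call to
# __train_all_agents, scaled by the population size.
steps_per_iteration = 2048   # env steps per agent per iteration (assumed)
num_envs = 4                 # parallel vectorized environments (assumed)
num_agents = 6               # MOPPO population size mentioned above

global_step = 0
for iteration in range(10):
    # ... __train_all_agents would run PPO updates for every agent here ...
    global_step += steps_per_iteration * num_envs * num_agents

print(global_step)  # 10 * 2048 * 4 * 6 = 491520
```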
Now I have to double-check this, but it seems the implementation has been bugged, and performance is even worse than what we have reported so far.
Hmm, this is still generating some very weird plots of step vs. global_step on wandb...
You can join here: https://discord.gg/ygmkfnBvKA; it's easier to talk there.
It does not appear that the `global_step` variable is incremented in the training loop. Should it be incremented by the sum of `copied_agents.global_step` each iteration?
On completion of a run I am unable to generate eum vs. global_step plots, as there is only one `global_step` data point (with value 0).
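For completeness, a hedged sketch of that alternative suggestion, using a stand-in agent class since the real MOPPO objects are not reproduced here (all names are illustrative):

```python
# Sketch of deriving the outer counter from per-agent counters.
from dataclasses import dataclass

@dataclass
class AgentStub:
    global_step: int = 0  # stand-in for a per-agent step counter

# Pretend each copied agent has accumulated some steps during an iteration.
copied_agents = [AgentStub(global_step=8192) for _ in range(6)]

# Outer global_step taken as the sum of the population's counters.
global_step = sum(agent.global_step for agent in copied_agents)
print(global_step)  # 6 * 8192 = 49152
```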