LucasAlegre / morl-baselines

Multi-Objective Reinforcement Learning algorithms implementations.
https://lucasalegre.github.io/morl-baselines
MIT License

Issue with PGMORL logging global_step #106

Closed: adamcallaghan0 closed this issue 1 week ago

adamcallaghan0 commented 2 months ago

It does not appear that the "global_step" variable is incremented in the training loop. Should it be incremented by the sum of copied_agents.global_step each iteration?

On completion of a run, I am unable to generate eum vs. global step plots, as there is only one global_step data point (with value 0).

[image: pgmorl]
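The suggestion above, incrementing the outer counter by the sum of the copied agents' steps, can be sketched roughly as follows. This is a hypothetical illustration, not the actual repository code: AgentStub and update_global_step are stand-ins, and only the names copied_agents and global_step come from the comment.

```python
# Hypothetical sketch of the suggested fix: after each iteration, advance
# the outer global_step by the steps accumulated inside the copied agents.
# AgentStub is a stand-in for a MOPPO worker; not the actual PGMORL source.

class AgentStub:
    def __init__(self, global_step):
        self.global_step = global_step


def update_global_step(global_step, copied_agents):
    """Increment the outer counter by the sum of each copy's own steps."""
    return global_step + sum(a.global_step for a in copied_agents)


copied_agents = [AgentStub(1000), AgentStub(1500)]
global_step = update_global_step(0, copied_agents)
print(global_step)  # 2500
```

With an increment like this, each iteration produces a new, growing global_step value, so step-indexed plots get more than a single point at 0.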

ffelten commented 2 months ago

Hi,

If I remember correctly, global_step is incremented by MOPPO.

ffelten commented 2 months ago

Hi @adamcallaghan0,

I just ran some experiments with PGMORL, it seems you are right. I'll try to fix it along with the gymnasium 1.0 migration

adamcallaghan0 commented 2 months ago

Thank you! In the meantime I have implemented a minor fix on my end, changing the warmup block in the train function to: [image] and, slightly further down: [image]. This appears to be working correctly now.

ffelten commented 2 months ago

I have a concern with this right now: self.__train_all_agents trains all agents in the population (usually 6), so I think we should increment by self.steps_per_iteration * self.num_envs * len(self.agents).
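The counting described above can be sketched as a minimal example. This is not the PGMORL implementation; the class and its attribute names (steps_per_iteration, num_envs, agents) just mirror the expression in the comment, and the default values are arbitrary.

```python
# Minimal sketch of the proposed increment. Assumption: each call to the
# train-all-agents step runs every agent in the population for
# steps_per_iteration steps across num_envs vectorized environments.

class PGMORLSketch:
    def __init__(self, steps_per_iteration=2048, num_envs=4, num_agents=6):
        self.steps_per_iteration = steps_per_iteration
        self.num_envs = num_envs
        self.agents = list(range(num_agents))  # stand-ins for MOPPO workers
        self.global_step = 0

    def train_iteration(self):
        # The whole population is trained each iteration, so the counter
        # must advance by the total env steps taken across all agents,
        # not just one agent's share.
        self.global_step += (
            self.steps_per_iteration * self.num_envs * len(self.agents)
        )


trainer = PGMORLSketch()
trainer.train_iteration()
print(trainer.global_step)  # 2048 * 4 * 6 = 49152
```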

Now I have to double-check this, but it seems the implementation has been bugged so far, and performance is even worse than what we reported.

ffelten commented 2 months ago

Here is what I have so far: https://github.com/LucasAlegre/morl-baselines/blob/chore/update-gymnasium-1.0/morl_baselines/multi_policy/pgmorl/pgmorl.py#L510

adamcallaghan0 commented 2 months ago

Hmm, this is still generating some very weird plots of step vs. global_step on wandb...

[image: wandb plot]

ffelten commented 2 months ago

You can join here: https://discord.gg/ygmkfnBvKA, it's easier to talk there.