LucasAlegre / morl-baselines

Multi-Objective Reinforcement Learning algorithms implementations.
https://lucasalegre.github.io/morl-baselines
MIT License
271 stars, 44 forks

Bug in PCN due to logging #85

Closed wilrop closed 8 months ago

wilrop commented 8 months ago

Hi, it seems I introduced a bug in PCN when I extended the logger to also log the training parameters. I will open a pull request with a quick fix in a minute, but I wanted to have this issue as a reference.

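The problem starts at the following line, which annotates a parameter as np.ndarray but gives it a float as its default value. Python does not enforce type hints at runtime, so the mismatch goes unnoticed until the value is used: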

max_return: np.ndarray = 100.0

When we log this parameter, the code assumes that max_return really is an array and that tolist() is available. When the float default is used, however, this raises an AttributeError (a Python float has no tolist() method) at the following line:

"max_return": max_return.tolist(),
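
For reference, the failure and one possible guard can be sketched as below. This is a minimal illustration and not necessarily the fix that ends up in the pull request: the helper names log_params_unsafe and log_params_safe are hypothetical and do not exist in morl-baselines. The idea is to coerce the value with np.asarray before calling tolist(), so that both a float default and a real array can be logged.

```python
import numpy as np


def log_params_unsafe(max_return):
    # Hypothetical helper mirroring the failing pattern:
    # it assumes max_return is already an np.ndarray.
    return {"max_return": max_return.tolist()}


def log_params_safe(max_return):
    # Hypothetical guard: np.asarray accepts floats, lists and arrays,
    # so tolist() is always available on the result.
    return {"max_return": np.asarray(max_return).tolist()}


# Reproduce the bug with the float default from the signature.
try:
    log_params_unsafe(100.0)
    unsafe_error = None
except AttributeError as exc:
    unsafe_error = exc

# The guarded version handles both the float default and a real array.
scalar_logged = log_params_safe(100.0)
array_logged = log_params_safe(np.array([1.5, 2.5]))
```

An alternative would be to change the default itself so that it matches the annotation (for example, defaulting to None and building the array inside the function), which keeps the logging code unchanged.
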
wilrop commented 8 months ago

https://github.com/LucasAlegre/morl-baselines/pull/86