DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

[Bug]: Potential Bug in PPO? Clarification requested #1894

Closed. azrael417 closed this issue 2 months ago.

azrael417 commented 2 months ago

🐛 Bug

Hello all,

Apologies if this is a false alarm, but shouldn't the th.min in

https://github.com/DLR-RM/stable-baselines3/blob/5623d98f9d6bcfd2ab450e850c3f7b090aef5642/stable_baselines3/ppo/ppo.py#L231

be th.minimum instead? th.min behaves differently from th.minimum. I was reimplementing your PPO algorithm in libtorch, the compiler barfed at exactly that point, and I think the compiler is right.
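
For context, the line in question computes the clipped surrogate objective. Paraphrased (variable names approximate, with dummy inputs so it runs standalone), it looks like this:

import torch as th

# Dummy stand-ins for the quantities computed earlier in ppo.py
advantages = th.randn(8)
ratio = th.exp(th.randn(8) * 0.1)  # new/old policy probability ratio
clip_range = 0.2

# Clipped surrogate objective, paraphrased from ppo.py (names approximate)
policy_loss_1 = advantages * ratio
policy_loss_2 = advantages * th.clamp(ratio, 1 - clip_range, 1 + clip_range)
policy_loss = -th.min(policy_loss_1, policy_loss_2).mean()  # <- the th.min in question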

Let me know what you think

Best regards,
Thorsten

To Reproduce

from stable_baselines3 import ...

Relevant log output / Error message

No response

System Info

No response

araffin commented 2 months ago

th.min behaves differently from th.minimum.

Are you sure?

import torch as th

# Two tensors that differ element-wise
a = th.ones(2, 4)
b = th.zeros(2, 4)
b[0, 2] = 3

# The two-tensor form of th.min is element-wise, just like th.minimum
assert th.allclose(th.min(a, b), th.minimum(a, b))

EDIT: If you look at the documentation (https://pytorch.org/docs/stable/generated/torch.min.html), the two-tensor overload torch.min(input, other, *, out=None) redirects to torch.minimum(). So I think they are actually the same.
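
For anyone reading along: the two functions only differ in the single-tensor overloads. A quick sketch of the different call forms (worth double-checking against the docs for your PyTorch version):

import torch as th

x = th.tensor([[1.0, 4.0], [3.0, 2.0]])
y = th.tensor([[2.0, 0.0], [5.0, 1.0]])

print(th.min(x))         # tensor(1.) -- reduction over all elements
print(th.min(x, dim=1))  # (values, indices) namedtuple: per-row min and argmin
print(th.min(x, y))      # element-wise minimum, same as th.minimum(x, y)
print(th.minimum(x, y))  # tensor([[1., 0.], [3., 1.]]); requires two tensors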

azrael417 commented 2 months ago

You are right: when given two tensors, th.min seems to behave the same as th.minimum. That is good to know. You can close this issue.