Denys88 / rl_games

RL implementations
MIT License
851 stars 143 forks source link

Fix: TypeError: clamp(): argument 'min' (position 2) must be Number, … #142

Closed yuemingl closed 2 years ago

yuemingl commented 2 years ago

…not Tensor

Function clamp() requires a Number in the second parameter, however, curr_e_clip is a Tensor.

Denys88 commented 2 years ago

Ill doublecheck that if works with soft_clamp parameter and merge it. Thanks.

Denys88 commented 2 years ago

@yuemingl could you provide a test case how to reproduce the issue?

yuemingl commented 2 years ago

change clip_value to True in rl_games/configs/ppo_walker.yaml and run python runner.py --trian --file rl_games/configs/ppo_walker.yaml. The following error message will be reproduced: TypeError: clamp(): argument 'min' (position 1) must be Number, not Tensor

Denys88 commented 2 years ago

Surprisingly it works for me in current master. I still cannot reproduce it. But I updated to the pytorch version 1.10 could it be the reason?

yuemingl commented 2 years ago

I found that there are a couple of discussions about extending the max/min parameters of clamp() to Tensor, like https://github.com/pytorch/pytorch/pull/32587 https://github.com/pytorch/pytorch/issues/2793

The latest document of pyTorch supports Tensor as the max/min argument of clamp(), check it here https://pytorch.org/docs/stable/generated/torch.clamp.html

According to the configure files rl_games/configs/*.yaml, the clip_value is a float number. The usages of clamp() function in the code also assume it a float number. For example, torch.clamp(ratio, 1.0 - curr_e_clip, 1.0 + curr_e_clip). Unless there is a plan to extend it to support clamping by multi-values, I guess it's better to fix it to make the code compatible with older version of pyTorch.

Denys88 commented 2 years ago

@yuemingl I found what was wrong. I didn't cleanup well my old tests where I made e_clip trainable. Ill roll it back. No need to make tensor from the e_clip. Thanks for report!

Denys88 commented 2 years ago

I fixed it here: https://github.com/Denys88/rl_games/pull/145