RLE-Foundation / RLeXplore

RLeXplore provides stable baselines of exploration methods in reinforcement learning, such as intrinsic curiosity module (ICM), random network distillation (RND) and rewarding impact-driven exploration (RIDE).
https://docs.rllte.dev/
MIT License

Problems with action spaces > 1 in icm #21

Open Croip3 opened 3 weeks ago

Croip3 commented 3 weeks ago

Hi, I'm trying to get RLeXplore running with SB3. All of the examples work, but if I try an environment like Gym's Ant (https://www.gymlibrary.dev/environments/mujoco/ant/), it crashes with the following error:

```
File "/home/longarm_wsl/anaconda3/envs/metaworld3.12/lib/python3.11/site-packages/rllte/xplore/reward/icm.py", line 225, in update
    im_loss = (im_loss * mask).sum() / th.max(

RuntimeError: The size of tensor a (8) must match the size of tensor b (256) at non-singleton dimension 1
```

![image](https://github.com/user-attachments/assets/10a7c2e9-149a-48b2-afce-5ab48564caea)
I used the code from this [example](https://github.com/RLE-Foundation/RLeXplore/blob/main/2%20rlexplore_with_sb3.ipynb) and just changed the environment to 'Ant-v4'. 

I think it has something to do with the action space in continuous environments. I also tried it with the Meta-World robotics environments, and the size of tensor a matches the size of the action space. It works fine with the provided envs Pendulum-v1, CartPole, and MountainCarContinuous.

Any idea whether this is a bug or an error on my side? I haven't found a proper fix myself yet.
EDIT: The only "fix" I found is setting batch_size equal to the size of the action space. E.g. in Meta-World the action space is 4, and it works with batch_size = 4. Of course, this is not really a fix, more of a janky workaround.
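The symptom is consistent with a broadcasting mismatch: a per-element inverse-model loss of shape `(batch_size, action_dim)` being multiplied by a per-sample mask of shape `(batch_size,)`. A minimal sketch below reproduces the error and shows one possible repair; the shapes (256, 8) and the `unsqueeze` fix are assumptions for illustration, not necessarily the project's actual patch.

```python
import torch as th

# Hypothetical shapes matching the report: batch_size=256, Ant's action_dim=8.
batch_size, action_dim = 256, 8

# Per-element inverse-model loss, one value per action dimension.
im_loss = th.rand(batch_size, action_dim)
# Per-sample mask, one value per batch element.
mask = (th.rand(batch_size) > 0.5).float()

# (256, 8) * (256,) cannot broadcast: the trailing dimensions (8 vs 256)
# differ, which reproduces the reported RuntimeError.
try:
    _ = im_loss * mask
except RuntimeError as e:
    print(f"broadcast error: {e}")

# One possible fix: add a singleton dimension so the mask
# applies per sample and broadcasts across the action dimension.
masked = im_loss * mask.unsqueeze(1)  # shape (256, 8)
loss = masked.sum() / th.max(mask.sum(), th.tensor(1.0))
```

This would also explain why setting `batch_size` equal to the action-space size "works": with equal sizes, the two tensors happen to broadcast, silently mixing batch and action dimensions.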
Croip3 commented 3 weeks ago

I think I found a fix; let me know if you want the details.

yuanmingqi commented 3 weeks ago

Yes, please submit a PR to rllte, and I will mark you as a contributor. Thanks!