takuseno / d3rlpy

An offline deep reinforcement learning library
https://takuseno.github.io/d3rlpy
MIT License

[BUG] Dimension error when trying to fit Probabilistic Ensemble Dynamics model to discrete action dataset #133

Open siomvas opened 2 years ago

siomvas commented 2 years ago

When trying to fit a Probabilistic Ensemble Dynamics (PED) model to any discrete-action dataset, I get a runtime error: RuntimeError: torch.cat(): Tensors must have same number of dimensions: got 2 and 1

This is caused by the forward method of the encoder, specifically x = torch.cat([x, action], dim=1)

As I understand it, no action-conditioned encoder can be used with discrete actions because of this.

Playing around with it and the pendulum example in the docs, I can only deduce that it is due to the behaviour of MDPDataset (screenshot not preserved): one axis is removed from the action array for discrete actions, which leads to the mismatch.
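The shape mismatch described above can be reproduced with plain PyTorch, independent of d3rlpy. The sketch below is illustrative only: the tensor sizes and variable names are hypothetical, and the one-hot step is an assumption about how a discrete action is typically made compatible with an action-conditioned encoder, not a statement of what d3rlpy does internally.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, chosen only for illustration.
batch_size, obs_dim, n_actions = 4, 3, 2

observations = torch.zeros(batch_size, obs_dim)  # 2-D: shape (4, 3)
actions = torch.tensor([0, 1, 1, 0])             # 1-D: shape (4,), discrete indices

# Concatenating a 2-D tensor with a 1-D tensor along dim=1 fails.
try:
    torch.cat([observations, actions], dim=1)
    cat_failed = False
except RuntimeError as exc:
    cat_failed = True
    print(exc)

# Assumed remedy: one-hot encode the action indices so both tensors are 2-D.
one_hot_actions = F.one_hot(actions, num_classes=n_actions).float()  # shape (4, 2)
encoder_input = torch.cat([observations, one_hot_actions], dim=1)    # shape (4, 5)
```

With continuous actions the action array already has shape (batch, action_dim), so the same concatenation works without any extra encoding step.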

Is this on purpose?

takuseno commented 2 years ago

@siomvas Hello, thanks for the issue. You should set discrete_action=True. https://github.com/takuseno/d3rlpy/blob/747b1747ad3e41eae8a93a8e02ca02c4d9e0ccb0/d3rlpy/dynamics/probabilistic_ensemble_dynamics.py#L90

siomvas commented 2 years ago

@takuseno Thanks for getting back to me, silly mistake! Changing that flag lets training start, but the training loss is NaN (loss=nan), and as a result evaluation throws ValueError: The parameter loc has invalid values
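The ValueError is consistent with a NaN reaching the mean of a PyTorch distribution. The sketch below uses plain PyTorch, not d3rlpy code, and shows that building a Normal with a NaN loc raises a ValueError when argument validation is enabled. The helper has_non_finite is a hypothetical name introduced here; it could be pointed at model outputs or dataset arrays to find where NaNs first appear. The exact wording of the error message varies between PyTorch versions.

```python
import torch
from torch.distributions import Normal

loc = torch.tensor([float("nan")])  # a mean that has become NaN
scale = torch.tensor([1.0])

# With validation enabled, a NaN mean is rejected at construction time.
try:
    Normal(loc, scale, validate_args=True)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)


def has_non_finite(tensor: torch.Tensor) -> bool:
    """Return True if the tensor contains any NaN or infinite value."""
    return not bool(torch.isfinite(tensor).all())
```

Running has_non_finite on the observations, actions, and rewards before training is one way to rule out bad input data as the source of the NaN loss.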

takuseno commented 2 years ago

@siomvas That sounds odd. Could you share a minimal code example that reproduces the issue?