Status: Open. siomvas opened this issue 2 years ago.
@siomvas Hello, thanks for the issue. You should set discrete_action=True.
https://github.com/takuseno/d3rlpy/blob/747b1747ad3e41eae8a93a8e02ca02c4d9e0ccb0/d3rlpy/dynamics/probabilistic_ensemble_dynamics.py#L90
@takuseno Thanks for getting back to me, silly mistake! Changing that flag lets training start, but the training loss is NaN (loss=nan), and as a result evaluation throws ValueError: The parameter loc has invalid values.
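For anyone reading along, a ValueError about the parameter loc is what PyTorch's distribution classes raise when argument validation finds a NaN, so the evaluation error is most likely a downstream symptom of the NaN loss rather than a separate bug. The snippet below is a minimal sketch of that mechanism in plain PyTorch. It assumes the dynamics model builds a Normal distribution from its predicted mean, which the thread does not confirm, and the exact message wording differs between PyTorch versions.

```python
import torch
from torch.distributions import Normal

# Hypothetical predicted mean containing a NaN, as a diverged model might produce.
loc = torch.tensor([0.0, float("nan")])
scale = torch.ones(2)

try:
    # validate_args=True makes the constructor check that loc is a real number.
    Normal(loc, scale, validate_args=True)
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")

# A cheap guard that could be run on model outputs before building a distribution.
has_nan = bool(torch.isnan(loc).any())
```

Checking for NaN right after the forward pass (as in the last line) helps locate the first step at which training diverges.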
@siomvas That sounds odd. Could you share a minimal code example that reproduces the issue?
When trying to fit a PED (probabilistic ensemble dynamics) model to any discrete-action dataset, I get a runtime error:
RuntimeError: torch.cat(): Tensors must have same number of dimensions: got 2 and 1
This is caused by the forward method of the encoder, specifically x = torch.cat([x, action], dim=1).
As I understand it, no action-conditioned encoder can be used with discrete actions because of this.
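The mismatch can be reproduced outside the library with plain PyTorch. The sketch below uses made-up sizes and tensor names. It shows the same RuntimeError when a 1-D tensor of discrete action indices is concatenated with a 2-D observation batch. It then shows one common remedy, one-hot encoding the indices so that both tensors are 2-D. This illustrates the shape problem only and is not a claim about how d3rlpy handles discrete actions internally.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes, chosen only for illustration.
batch_size, obs_dim, n_actions = 4, 3, 2

x = torch.randn(batch_size, obs_dim)   # observations, shape (4, 3)
action = torch.tensor([0, 1, 1, 0])    # discrete action indices, shape (4,)

try:
    torch.cat([x, action], dim=1)      # 2-D tensor with 1-D tensor
    failed = False
except RuntimeError as err:
    failed = True
    print(f"RuntimeError: {err}")

# One-hot encoding turns the indices into a (4, 2) float tensor,
# which can be concatenated with the observations along dim=1.
one_hot = F.one_hot(action, num_classes=n_actions).float()
combined = torch.cat([x, one_hot], dim=1)
print(combined.shape)
```

After encoding, the concatenated input has obs_dim + n_actions columns, so any layer consuming it needs its input size set accordingly.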
Playing around with it and the pendulum example in the docs, I can only deduce that this is due to the behaviour of MDPDataset: one axis is removed for discrete actions, leading to the mismatch.
Is this on purpose?