AI4Finance-Foundation / ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥
https://ai4finance.org

Deal with discrete observation space when computing `state_dim` #23

Closed Renovamen closed 3 years ago

Renovamen commented 3 years ago

Hi! Thank you for this awesome library, it helps me a lot.

I'm new to RL and not sure whether I'm missing something, but it seems that ElegantRL doesn't handle environments with a discrete observation space (like FrozenLake, whose `observation_space.shape` is `()`):

https://github.com/AI4Finance-LLC/ElegantRL/blob/d82f33d960c356bebce4178f881d518197686f06/elegantrl/env.py#L198-L199

Do you think it would be better to change this to:

import gym

if isinstance(env.observation_space, gym.spaces.Discrete):
    # Discrete spaces have shape (), so use the number of states instead
    state_dim = env.observation_space.n
elif isinstance(env.observation_space, gym.spaces.Box):
    state_shape = env.observation_space.shape
    # state_dim is an int for 1-D observations, otherwise the full shape tuple
    state_dim = state_shape[0] if len(state_shape) == 1 else state_shape

Thanks.

Yonv1943 commented 3 years ago

I was in another city for the past few days, so I couldn't reply until today.

Modifying only the part of the code you listed is not enough. You would also need to add a one-hot encoder for the state, or add a traditional RL algorithm.
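A one-hot encoder of the kind mentioned here could look roughly like the sketch below. It is not ElegantRL code: the function name `one_hot` and its signature are hypothetical, and it uses plain Python lists only to stay dependency-free. The example value 16 is the number of states in FrozenLake's default 4x4 map.

```python
def one_hot(state: int, n_states: int) -> list:
    """Return a length-n_states vector with 1.0 at index `state` and 0.0 elsewhere."""
    if not 0 <= state < n_states:
        raise ValueError(f"state {state} is outside [0, {n_states})")
    vec = [0.0] * n_states
    vec[state] = 1.0
    return vec


# A discrete observation (an integer) becomes a fixed-size float vector,
# so `state_dim` for the network would be n_states.
encoded = one_hot(5, 16)
```

With this encoding the network input size equals the number of discrete states, which is why changing `state_dim` alone would not be sufficient: the integer observation also has to be converted before it reaches the network.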


Some environments with a discrete observation space (like FrozenLake) are easy to solve with traditional reinforcement learning algorithms (like Q-learning). There is no need to use deep reinforcement learning (DRL) algorithms for them, and ElegantRL is a DRL library.

That is why we didn't add RL algorithms for environments with a discrete observation space.
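To illustrate why a tabular method is enough for small discrete environments, here is a minimal Q-learning sketch. It does not use ElegantRL or gym: the five-state chain environment, the `step` function, and all hyperparameters are assumptions made up for this example.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: reward 1.0 only when the goal is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # one row of action values per state
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)  # explore, or break ties randomly
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train_q_table()
# Greedy action for every non-terminal state
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

The whole "model" is a table with one row per state, so no neural network or one-hot encoding is involved. The table grows linearly with the number of states, which is the limitation raised later in this thread.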


Maybe I will add Q-learning for environments with a discrete observation space, for teaching purposes. For now I need to concentrate on updating the multi-GPU version. If more people have the same need as you, I will add Q-learning algorithms.

Renovamen commented 3 years ago

Hi, sorry for my late reply. Your explanation is clear, thank you very much!

I asked this question because I guessed that when the state space is discrete but very large, a tabular method might not be a good idea. Based on your response and what I have learned since, I now know which additional operations are needed at a minimum (one-hot encoding, as you said, or even a more suitable representation to avoid the massive dimensionality).
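One common way to get "a more suitable representation" is an embedding table that maps each state index to a short dense vector. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the table is randomly initialised rather than learned, and plain Python lists stand in for a real tensor library. It also shows that looking up a row gives the same result as multiplying a one-hot vector by the table, without ever building the large one-hot vector.

```python
import random


def make_embedding_table(n_states, embed_dim, seed=0):
    """Create an n_states x embed_dim table of small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)] for _ in range(n_states)]


def embed(table, state):
    """Direct lookup: the cost does not depend on the number of states."""
    return table[state]


def one_hot_times_table(table, state):
    """Same result via an explicit one-hot vector times the table (cost grows with n_states)."""
    n_states, embed_dim = len(table), len(table[0])
    one_hot = [1.0 if i == state else 0.0 for i in range(n_states)]
    return [sum(one_hot[i] * table[i][j] for i in range(n_states)) for j in range(embed_dim)]


table = make_embedding_table(n_states=1000, embed_dim=8)
dense_state = embed(table, 7)  # 8 numbers instead of a 1000-dimensional one-hot vector
```

In a DRL setting the table entries would normally be trained together with the rest of the network; here they are fixed random values purely to show the shapes involved.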

Looking forward to ElegantRL's updates.