Hi @kka0009!
In general, `v_min` and `v_max` for Categorical DQN are set according to the reward range of the environment.
According to the gym code, the reward in Acrobot-v1 is either -1 or 0, so it would be better to set `v_min` and `v_max` to -1 and 0.
For `atom_size`, 51 is the value the paper reports as working best, but if the reward range is very large, you should increase it.
However, this is only a guideline; you should tune these values experimentally to get the best agent.
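
To make the role of the three parameters concrete, here is a minimal sketch of how the support is usually built: `atom_size` evenly spaced atoms between `v_min` and `v_max`. The helper names `make_support` and `discounted_return_bounds` are made up for illustration and do not come from any library, and `gamma = 0.99` is only an assumed discount factor. Since the atoms represent returns (discounted sums of rewards) rather than single-step rewards, the second helper gives a bound that may be worth comparing against when you experiment with wider ranges.

```python
def make_support(v_min: float, v_max: float, atom_size: int) -> list:
    """Return atom_size evenly spaced atoms from v_min to v_max (inclusive)."""
    if atom_size < 2 or v_max <= v_min:
        raise ValueError("need atom_size >= 2 and v_max > v_min")
    delta_z = (v_max - v_min) / (atom_size - 1)  # spacing between atoms
    return [v_min + i * delta_z for i in range(atom_size)]


def discounted_return_bounds(r_min: float, r_max: float, gamma: float) -> tuple:
    """Bounds on an infinite discounted sum of rewards lying in [r_min, r_max]."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return r_min / (1.0 - gamma), r_max / (1.0 - gamma)


# Setting suggested above for Acrobot-v1: v_min=-1, v_max=0, atom_size=51.
support = make_support(-1.0, 0.0, 51)
print(len(support), support[0], support[1] - support[0])

# Assumed discount factor (not taken from the thread): gamma = 0.99.
print(discounted_return_bounds(-1.0, 0.0, 0.99))
```

A wider range with the same `atom_size` gives coarser spacing between atoms, which is why a larger `atom_size` can help when the range grows.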