jt70 opened 9 months ago
The same thing happens to me. Honestly, I can't find the cause; I'll have to check thoroughly where the problem is.
@jt70 After searching and tinkering with the framework, I found the solution. To use dueling you have to change the learner, the actor, *and* the agent: the corresponding folders each contain both a dqn.py and a dueling file, and you have to switch to the dueling versions. It took me a while to spot the problem because the first thing I did was change the learner, without realizing I also had to change the actor and the agent; if you don't change all three, you get an error in the learner's update function. Also, if you use dueling, you have to set the dueling parameter to True in dqn.py in the examples folder, otherwise that will raise an error too.
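For context, the reason the learner's update differs is that a dueling network splits the Q-function into separate value and advantage streams and recombines them. A minimal standalone sketch of that aggregation (illustrative only, not ProtoRL's actual classes):

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine the dueling streams: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Subtracting the mean advantage keeps V and A identifiable,
    which is why a plain-DQN learner can't be reused as-is.
    """
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

# With V(s) = 1.0 and advantages [2.0, 4.0] (mean 3.0):
q = dueling_q(1.0, [2.0, 4.0])
print(q)  # [0. 2.]
```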
```python
from protorl.agents.dueling import DuelingDQNAgent as Agent
from protorl.actor.dueling import DuelingDQNActor as Actor
from protorl.learner.dueling import DuelingDQNLearner as Learner
from protorl.loops.single import EpisodeLoop
from protorl.policies.epsilon_greedy import EpsilonGreedyPolicy
from protorl.utils.network_utils import make_dqn_networks
from protorl.wrappers.common import make_env
from protorl.memory.generic import initialize_memory


def main():
    env_name = 'CartPole-v1'
    use_prioritization = True
    use_double = True
    use_dueling = True
    use_atari = False
    layers = [32]
    env = make_env(env_name, use_atari=use_atari)
    n_games = 1500
    bs = 64
    # 0.3, 0.5 works okay for cartpole
    # 0.25, 0.25 doesn't seem to work
    # 0.25, 0.75 doesn't work
    memory = initialize_memory(max_size=100_000,
                               obs_shape=env.observation_space.shape,
                               batch_size=bs,
                               n_actions=env.action_space.n,
                               action_space='discrete',
                               prioritized=use_prioritization,
                               alpha=0.3,
                               beta=0.5)
    policy = EpsilonGreedyPolicy(n_actions=env.action_space.n, eps_dec=1e-4)
    q_eval, q_target = make_dqn_networks(env, hidden_layers=layers,
                                         use_double=use_double,
                                         use_dueling=use_dueling,
                                         use_atari=use_atari)
    dqn_actor = Actor(q_eval, q_target, policy)
    q_eval, q_target = make_dqn_networks(env, hidden_layers=layers,
                                         use_double=use_double,
                                         use_dueling=use_dueling,
                                         use_atari=use_atari)
    dqn_learner = Learner(q_eval, q_target, use_double=use_double,
                          prioritized=use_prioritization, lr=1e-4)
    agent = Agent(dqn_actor, dqn_learner, prioritized=use_prioritization)
    sample_mode = 'prioritized' if use_prioritization else 'uniform'
    ep_loop = EpisodeLoop(agent, env, memory, sample_mode=sample_mode,
                          prioritized=use_prioritization)
    scores, steps_array = ep_loop.run(n_games)


if __name__ == '__main__':
    main()
```
I changed the parameters in examples/dqn.py to the above and I get an error: