jackyoung96 / RainbowDQN_highway

RainbowDQN algorithm for GYM highway environment

cannot reproduce the experiment results #1

Open xiayangluoxue opened 4 months ago

xiayangluoxue commented 4 months ago

I have a question that I hope you can help me with. I tried to run the model you provided in the environment with the same parameter configuration as described, but the final result is not as good as the video shown in the README file. Can you explain why this might be happening?

Thank you.

jackyoung96 commented 4 months ago

> I have a question that I hope you can help me with. I tried to run the model you provided in the environment with the same parameter configuration as described, but the final result is not as good as the video shown in the README file. Can you explain why this might be happening?
>
> Thank you.

Hi, thank you for your interest, and sorry the results failed to reproduce. Unfortunately, this repository is quite outdated; many of the libraries I used have since been updated, and I suspect that introduced some differences.

But if you share more information, such as the reward curve, loss curve, or a result video, I can help you.

xiayangluoxue commented 4 months ago

Thank you very much for your response. I recently tried again and finally succeeded in reproducing the results. Thank you for sharing! Due to library updates, some code changes were indeed necessary; I have written down the changes I made here so that others can refer to them.

1. Env creation: replace `import gym` with `import gymnasium as gym`, and change the environment construction from

    env = gym.make(args.envs)
    env.configure(highway_config(args.observation_type))

to

    env_config = highway_config(args.observation_type)
    env_kwargs = {"config": env_config}
    env = gym.make(args.envs, **env_kwargs)
    env = WrapperGym(env)

The `WrapperGym` wrapper, which adapts the new Gymnasium reset/step API back to the old Gym interface, is:

    import gymnasium as gym

    class WrapperGym(gym.Wrapper):
        def __init__(self, env):
            super().__init__(env)

        def reset(self, **kwargs):
            # Gymnasium's reset returns (obs, info); drop info to match the old API
            state, info = self.env.reset(**kwargs)
            return state

        def step(self, action):
            # Gymnasium's step returns a 5-tuple; collapse terminated/truncated into done
            state, reward, terminated, truncated, info = self.env.step(action)
            done = terminated or truncated
            return state, reward, done, info
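
For reference, a minimal usage sketch of the wrapped environment with the old-style loop (`highway-v0` is an illustrative env id; `import highway_env` registers the highway environments with Gymnasium):

    import gymnasium as gym
    import highway_env  # registers highway-v0 and related env ids

    env = WrapperGym(gym.make("highway-v0"))
    state = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # stand-in for agent.act(state)
        state, reward, done, info = env.step(action)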

2. Dir creation: `os.path.makedirs()` -> `os.makedirs()`
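
A one-line illustration of the corrected call (the output path here is hypothetical):

    import os

    run_dir = "runs/rainbow_highway"  # hypothetical output directory
    # makedirs lives in os, not os.path; exist_ok=True avoids FileExistsError on reruns
    os.makedirs(run_dir, exist_ok=True)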

3. CUDA device: to fix `RuntimeError: Attempting to deserialize object on CUDA device`, change

    self.net.load_state_dict(torch.load(os.path.join(path, name)))

to

    self.net.load_state_dict(torch.load(os.path.join(path, name), map_location={'cuda:3': 'cuda:0'}))
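
If the checkpoint may come from an arbitrary GPU index, a more portable sketch (not from the repo) is to remap everything onto whatever device is available locally:

    import torch

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    # map_location accepts a device; every saved tensor is remapped onto it
    state_dict = torch.load(os.path.join(path, name), map_location=device)
    self.net.load_state_dict(state_dict)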

4. Agent: the `Agent` class has no attribute `act_e_greedy`, so use `act()` instead.
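
Rainbow relies on NoisyNet layers for exploration, which is likely why a plain `act()` is the intended interface. If you still want epsilon-greedy behavior on top of it, a minimal sketch (the `agent.act(state)` call is assumed from the repo; the helper itself is hypothetical):

    import random

    def act_e_greedy(agent, env, state, epsilon=0.05):
        # Hypothetical replacement for the missing method: with probability
        # epsilon take a random action, otherwise defer to the agent's policy.
        if random.random() < epsilon:
            return env.action_space.sample()
        return agent.act(state)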