TheMTank / cups-rl

Customisable Unified Physical Simulations (CUPS) for Reinforcement Learning. Experiments run on the ai2thor environment (http://ai2thor.allenai.org/), e.g. using A3C, RainbowDQN and A3C_GA (Gated-Attention multi-modal fusion) for Task-Oriented Language Grounding (tasks specified by natural language instructions such as "Pick up the Cup or else").
http://www.themtank.org
MIT License

Regarding policy/model/weights #18

Open zyzhang1130 opened 4 years ago

zyzhang1130 commented 4 years ago

Would you mind clarifying whether the policy/model/weights are saved after each epoch/iteration? If not, how should I make that happen? If so, where are they saved? I see that agent.py just refers to the model via 'args.model_path' and calls save(self, path, filename) without specifically assigning a path or filename.

Thank you for replying.

zyzhang1130 commented 4 years ago

Hi, I found this piece of info quite relevant! https://github.com/Kaixhin/Rainbow/pull/58 However, I am not proficient enough to incorporate it into your Rainbow code (judging by when it was added, your current version of Rainbow probably does not have this feature). Would you mind taking a look at it?

Thank you so much

zyzhang1130 commented 4 years ago

Actually, the contributor of Rainbow says these lines save the model weights: https://github.com/Kaixhin/Rainbow/blob/d3afb5ad570137d675d6c7c903c050c8a19db084/main.py#L179-L181 How do you think I should incorporate this into the cups-rl Rainbow code?

Thank you.

beduffy commented 4 years ago

Line 79 in rainbow/test.py, which is called periodically from main.py, is the code that saves the model:

# Save model parameters if improved
if avg_reward > best_avg_reward:
    best_avg_reward = avg_reward
    dqn.save(path='weights', filename='rainbow_{}.pt'.format(num_steps))

The weights are therefore saved under the rainbow directory (in a 'weights' folder), which is related to #17. Set your evaluation-interval lower. If it still doesn't save the model after args.evaluation_interval steps, let me know.
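
If you want a checkpoint at every evaluation rather than only when the average reward improves, a minimal sketch reusing the same names and save API as above (the 'rainbow_latest.pt' filename is just an example) would be:

# Save the best-performing weights as before
if avg_reward > best_avg_reward:
    best_avg_reward = avg_reward
    dqn.save(path='weights', filename='rainbow_{}.pt'.format(num_steps))
# Additionally keep the most recent weights, overwriting the previous checkpoint
dqn.save(path='weights', filename='rainbow_latest.pt')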

zyzhang1130 commented 4 years ago

Sorry, can I check whether the current repo supports resuming training?
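
(By resuming I mean something along the lines of the generic PyTorch sketch below; the 'weights/rainbow_latest.pt' path and the dqn.online_net attribute are only assumptions for illustration, and the actual names in this repo may differ.)

import os
import torch

# Generic sketch: load previously saved weights before continuing training.
# The checkpoint path and the online_net attribute are assumed names.
checkpoint_path = os.path.join('weights', 'rainbow_latest.pt')
if os.path.isfile(checkpoint_path):
    state_dict = torch.load(checkpoint_path, map_location='cpu')
    dqn.online_net.load_state_dict(state_dict)

Thank you.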