hoangminhle / hierarchical_IL_RL

Code for hierarchical imitation learning and reinforcement learning

Inconsistency between the models for training and testing. #3

Open SamitHuang opened 5 years ago

SamitHuang commented 5 years ago

I am re-implementing your interesting work and have a few questions about the Montezuma's Revenge task. During training, in run_hybrid_atari_experiment.py, you use Hdqn(GPU) as the subgoal network, but for testing, in test_model.py, you use a different architecture, Net(), as the subgoal network. Why are they inconsistent? Could you please upload the trained weights and the code for using Hdqn(GPU) in testing?
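For reference, this is roughly what I mean by reusing the training-time network at test time. It's a minimal sketch assuming a PyTorch setup where the Hdqn weights were saved via torch.save; the import path, checkpoint filename, and the 1x4x84x84 input shape are my guesses, not the repo's actual API:

```python
import torch
from hybrid_model import Hdqn  # hypothetical module path; adjust to the repo's file layout

# Rebuild the training-time architecture and load its saved weights.
subgoal_net = Hdqn()
subgoal_net.load_state_dict(torch.load("hdqn_subgoal.pth", map_location="cpu"))
subgoal_net.eval()  # disable dropout / batch-norm updates for evaluation

# Dummy observation: one stack of four 84x84 Atari frames (assumed input shape).
obs = torch.zeros(1, 4, 84, 84)
with torch.no_grad():
    q_values = subgoal_net(obs)              # Q-values over primitive actions
    action = q_values.argmax(dim=-1).item()  # greedy action for evaluation
print("greedy action:", action)
```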

Also, I notice that in testing the trained meta controller is actually not used. Instead, the subgoals are set manually, and each subgoal is reached by a simple_net. This does not seem to surpass a purely supervised approach that uses imitation learning to reach each fixed subgoal in a fixed environment. Could you explain the generalizability of the method? Thanks!
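To make my question concrete, here is roughly the evaluation loop I expected, where the trained meta controller selects each subgoal instead of having it fixed by hand. The class and method names below are illustrative placeholders, not the repo's actual API:

```python
# Illustrative hierarchical evaluation loop; names are placeholders.
def evaluate_hierarchy(env, meta_controller, controller,
                       max_meta_steps=20, max_low_steps=500):
    obs = env.reset()
    total_reward, done = 0.0, False
    for _ in range(max_meta_steps):
        if done:
            break
        # The trained meta controller picks the next subgoal from the current state.
        subgoal = meta_controller.select_subgoal(obs)
        for _ in range(max_low_steps):
            # The low-level controller conditions on both the state and the subgoal.
            action = controller.act(obs, subgoal)
            obs, reward, done, _ = env.step(action)
            total_reward += reward
            if done or controller.subgoal_reached(obs, subgoal):
                break
    return total_reward
```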

hoangminhle commented 4 years ago

Sorry for the late reply. I'm not sure if I'm missing something from your question, but I'm pretty sure the trained meta controller was used during testing. I'll double-check and may upload the network weights once I find them.