cycraig / MP-DQN

Source code for the dissertation: "Multi-Pass Deep Q-Networks for Reinforcement Learning with Parameterised Action Spaces"
MIT License

How to finetune? #1

Open fanbbbb opened 5 years ago

fanbbbb commented 5 years ago

I have trained a model for "soccer PDQN", and I want to fine-tune a new task based on the trained model. What should I do?

cycraig commented 5 years ago

> fine-tune a new task based on the trained model

Hi, are you trying to do transfer learning, i.e. apply the trained model to a similar task in the HFO (soccer) environment, or just to optimise the hyperparameters of (M)P-DQN on HFO?

fanbbbb commented 5 years ago

Yes, I trained a model with 1 offense agent and 0 defense NPCs, and I transferred it to a task with 1 offense agent and 1 defense NPC: I modified some layers in PyTorch with random initialization, and it worked! However, there is a new issue I would like to ask about: I am trying to use this model in a task with 2 offense agents and 2 defense agents (a multi-agent setting), and I have no idea how to approach it. Could you please leave me an email address if convenient? There are some details I would like to consult you on. Thanks a lot.
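Roughly, what I did looks like the minimal sketch below; the QNet class, layer sizes, state dimensions, and checkpoint path are placeholders I made up for illustration, not the actual code:

```python
import torch
import torch.nn as nn

# Toy stand-in for a P-DQN Q-network; all names and sizes here are
# illustrative, not the repo's actual classes or dimensions.
class QNet(nn.Module):
    def __init__(self, state_dim, hidden_dim, out_dim):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, out_dim)  # one Q-value per discrete action

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

# Network trained on the 1v0 task (smaller state space).
old_net = QNet(state_dim=59, hidden_dim=128, out_dim=4)
old_net.load_state_dict(torch.load("pdqn_1v0.pt"))  # placeholder path

# New network for the 1v1 task: the input layer no longer matches, so
# copy only the parameters whose shapes still agree and leave the rest
# (here, fc1) at their random initialisation.
new_net = QNet(state_dim=68, hidden_dim=128, out_dim=4)
new_state = new_net.state_dict()
for name, param in old_net.state_dict().items():
    if param.shape == new_state[name].shape:
        new_state[name] = param
new_net.load_state_dict(new_state)
```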

cycraig commented 5 years ago

Nice work 👍

P-DQN in its original form only considers a single, independent agent. There has been some work on multi-agent reinforcement learning with parameterised actions, such as https://arxiv.org/abs/1903.04959. Your best bet would be to use their algorithm, which is designed with multiple agents in mind, or to extend P-DQN in a similar fashion; a naive starting point is sketched below.
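For instance, independent learners: one P-DQN agent per player, each treating the others as part of the environment (this ignores the non-stationarity that the paper above addresses). The classes in this sketch are self-contained stubs, not this repo's actual PDQNAgent or a real HFO wrapper:

```python
import numpy as np

class IndependentPDQN:
    """Stub for one P-DQN learner; only the interface used below matters."""
    def act(self, obs):
        # return (discrete action, continuous action parameters)
        return np.random.randint(3), np.random.uniform(-1.0, 1.0, size=2)

    def step(self, obs, action, reward, next_obs, done):
        pass  # store the transition and run a learning update here

class ToyMultiAgentEnv:
    """Stand-in for a 2v2 HFO wrapper with per-agent observations."""
    def __init__(self, n_agents):
        self.n = n_agents
        self.t = 0

    def reset(self):
        self.t = 0
        return [np.zeros(4) for _ in range(self.n)]

    def step(self, actions):
        self.t += 1
        obs = [np.random.randn(4) for _ in range(self.n)]
        rewards = [0.0] * self.n
        return obs, rewards, self.t >= 10, {}

env = ToyMultiAgentEnv(n_agents=2)
agents = [IndependentPDQN() for _ in range(2)]

obs = env.reset()
done = False
while not done:
    # each agent acts on its own observation only
    actions = [agent.act(o) for agent, o in zip(agents, obs)]
    next_obs, rewards, done, _ = env.step(actions)
    for agent, o, a, r, o2 in zip(agents, obs, actions, rewards, next_obs):
        agent.step(o, a, r, o2, done)
    obs = next_obs
```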

It's also a bit difficult to transfer models trained with fewer agents on HFO, since the state space grows with every agent added to the environment, so the input layer no longer matches (a rough workaround is sketched below). If you want to discuss this further you can email me at: mpdqn at pm.me
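On that state-space point: if the enlarged state vector simply appends the new agents' features at the end (which may not hold for HFO's feature sets), you could keep the learned input weights for the features that carry over. A minimal, self-contained illustration with placeholder dimensions:

```python
import torch
import torch.nn as nn

# Toy illustration: an input layer trained on 59 features versus a new
# one for 68 features (both dimensions are placeholders).
old_fc1 = nn.Linear(59, 128)   # imagine these weights are trained
new_fc1 = nn.Linear(68, 128)   # randomly initialised

with torch.no_grad():
    # keep the learned weights for the 59 features that carry over;
    # only the 9 new columns stay at their random initialisation
    new_fc1.weight[:, :old_fc1.in_features] = old_fc1.weight
    new_fc1.bias.copy_(old_fc1.bias)
```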