Double DQN miserable attempt

theomarzaki commented 5 years ago

@J4BB3R unfortunately I cannot see the branch you are working on, but I would start off this way:

Pull latest master
Copy the whole dueling DQN page into a new page
Change the model architecture to fit Double DQN
Refactor as necessary

Hope this helps :)

turbokadi commented 5 years ago

@theomarzaki I forgot to push my branch; it's on its way.

theomarzaki commented 5 years ago

Hello @J4BB3R ,

I took a look at the branch, and everything seems to be coming together:

I have fixed the bugs in the file (check the commit diff for the full details). In summary:

The model needs access to fields such as the learning rate and mini-batch size (adding a parser will really clean up the implementation, but you need to assign the values taken from the parser to the model's own fields)
The model is now incorporated in the environment (adding the Agent, giving the model the data, ...) so training can occur on the data set
The target network has been incorporated in training (passed as a parameter to stabilise training) -> this could and should be changed to see the effects (optimality)
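To illustrate the first point, here is a minimal sketch of a parser whose values are assigned to the model. The class name `DoubleDQN`, the flag names, and the defaults are hypothetical and are not taken from the repository:

```python
import argparse


class DoubleDQN:
    """Hypothetical model wrapper; not the repository's actual class."""

    def __init__(self, learning_rate, batch_size):
        # The model keeps its own copy of the hyperparameters so that
        # training code can read them without going back to the parser.
        self.learning_rate = learning_rate
        self.batch_size = batch_size


def build_parser():
    # Flag names and defaults are assumptions made for this sketch.
    parser = argparse.ArgumentParser(description="Double DQN training (sketch)")
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    parser.add_argument("--batch-size", type=int, default=32)
    return parser


def model_from_args(argv=None):
    # argv=None makes argparse read sys.argv; a list can be passed for testing.
    args = build_parser().parse_args(argv)
    # Explicitly hand the parsed values over to the model's fields.
    return DoubleDQN(learning_rate=args.learning_rate,
                     batch_size=args.batch_size)
```

For example, `model_from_args(["--batch-size", "64"])` would build a model with a mini-batch size of 64 and the default learning rate.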

Although it is ready to train now, I would take a deeper look at the actual Double DQN architecture itself (I suggest trying something along the lines of Double DQN PyTorch (GitHub) to get started).
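For reference, the core difference between DQN and Double DQN is in how the training target is computed: the online network picks the next action and the target network evaluates it. Below is a minimal sketch in plain Python for a single transition, assuming the Q-values of the next state are already available as lists; the function names are hypothetical and this is not the repository's implementation:

```python
def dqn_target(reward, done, next_q_target, gamma=0.99):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates that action."""
    if done:
        return reward
    # Action selection uses the online network's Q-values.
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's Q-value for that action.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation is what reduces the overestimation that the single `max` in standard DQN tends to introduce. In a PyTorch version the same logic would be applied to batched tensors.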

Other than that I think all is well :) !

turbokadi commented 5 years ago

OK, I've looked at the modifications, but I don't understand why you would need those values in the model: does the parent need them? My implementation is not really an implementation because it's just your DQN. I'm currently working through a PyTorch Double DQN example to adapt it to our case.

Have a nice day :)

theomarzaki commented 5 years ago

From an OOP perspective, I thought it would make for a cleaner structure to have all the models in the same file later (after all the research has been completed), as they all use the same env simulation etc.
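One way to picture this structure is a shared parent that holds the environment and hyperparameters, with each model overriding only the part that differs. This is a rough sketch under that assumption; `BaseAgent`, `DQNAgent`, `DoubleDQNAgent`, and their methods are hypothetical names, not the repository's classes:

```python
class BaseAgent:
    """Hypothetical parent holding what every model shares."""

    def __init__(self, env, learning_rate=1e-3, batch_size=32, gamma=0.99):
        self.env = env                      # the common simulation environment
        self.learning_rate = learning_rate  # shared hyperparameters live here
        self.batch_size = batch_size
        self.gamma = gamma

    def next_state_value(self, next_q_online, next_q_target):
        # Each model defines how the next state's value is estimated.
        raise NotImplementedError

    def td_target(self, reward, done, next_q_online, next_q_target):
        # The target formula is shared; only next_state_value differs.
        if done:
            return reward
        return reward + self.gamma * self.next_state_value(next_q_online,
                                                           next_q_target)


class DQNAgent(BaseAgent):
    def next_state_value(self, next_q_online, next_q_target):
        # Target network selects and evaluates the action.
        return max(next_q_target)


class DoubleDQNAgent(BaseAgent):
    def next_state_value(self, next_q_online, next_q_target):
        # Online network selects the action, target network evaluates it.
        best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
        return next_q_target[best]
```

With this layout the parent needs the hyperparameter values because the shared training logic reads them, while the subclasses stay small.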

I think it is worth merging master into your branch, as I continuously update the environment methods etc. to create a more stable training environment so the models converge faster.

Thank you very much, likewise!

theomarzaki / TrafficOrchestrator

Double DQN miserable attempt #19