Open rdgreene opened 7 years ago
Leave it in my hands :bowtie: I will clean up my branch, account for the new record formats, and I will start writing and alternative function for Q update; double Q-learning.
Everything is awesome! :cactus:
Excellent! When you've added all the plotting code, do you want to let me know and I'll run some parameter searches?
Also excellent use of emojis, have a π₯
Hi @rdgreene , I fixed the issues by changing scipy functions to numpy as you suggested. I believe the code is ready for merging with master.
have a look :eyes::eyes:
@miguelesteras
I tried running the parameter sweep but I've run into some problems with the the plotting functions not working again! I've temporarily disabled them and added a simple plotting script just to visualise the results so I can do the parameter sweep. I will save the cost arrays so we can plot them properly later! Will push the results tonight...
:octocat: @miguelesteras
I have made some significant changes to our main script and saved a version of it as tsp.py to the main branch.
The first thing you will notice is that I have packaged all the Q-Learning code into a function. This was mostly just to clean up the work space (I got tired of so much scrolling π ), but making it modular like this should also come in handy in the future. I have added this function to the main branch as qLearn.py...
Second thing that should get your attention is the I've nested the function within some loops to run through a parameter search (with n samples run for each parameter combination). This should all be fairly self-explanatory, but let me know if it isn't.
Finally you'll notice that the 'epoch cost' output is of a different form. As we're now running through n samples of m parameter combinations, I've had to do some aggregation. What you'll find is that the script returns a matrix of dimensions e by m, where e is the number of epochs, and m is as stated above. Each column vector in this matrix will essentially represent the average cost for each epoch for a given combination of parameters.
So now to the 'issue' at hand. I propose we move forward with tsp.py as our main script. This will require copying over and making some minor amendments to the plotting code you've already written. Once that's done we can archive the old code to clean things up a bit. Thoughts?
Ps π€