new setup for experiment4

I have changed the setup for the rerun of experiment4. What was added:

I have fixed the problem in our implementation that we could not specify a separate epsilon decay for each task. I first train the agent on the first task with the appropriate epsilon decay, then load it for the second task and overwrite its epsilon decay with a value that matches the current epsilon reset.
The rest of the files (the agents trained on the first task and the meta files in the appropriate formats) I will send to Maciej, who is responsible for evaluating the impact on the other platform, so that they do not clutter our repo.
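
For reviewers, here is a minimal sketch of the per-task epsilon decay override described above. It does not use our actual classes: `EpsilonSchedule`, `decay_for` and `prepare_for_next_task` are hypothetical names, the multiplicative decay rule is an assumption, and "matches the epsilon reset" is interpreted here as "reaches the minimum epsilon after a chosen number of steps". All numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class EpsilonSchedule:
    """Hypothetical stand-in for the exploration state stored in a saved agent."""
    epsilon: float = 1.0
    epsilon_decay: float = 0.999
    min_epsilon: float = 0.01

    def step(self) -> float:
        # Assumed rule: multiplicative decay, clipped at min_epsilon.
        self.epsilon = max(self.min_epsilon, self.epsilon * self.epsilon_decay)
        return self.epsilon


def decay_for(epsilon_reset: float, min_epsilon: float, steps: int) -> float:
    """Decay factor that takes epsilon from epsilon_reset to min_epsilon in `steps` steps."""
    return (min_epsilon / epsilon_reset) ** (1.0 / steps)


def prepare_for_next_task(schedule: EpsilonSchedule,
                          epsilon_reset: float,
                          epsilon_decay: float) -> EpsilonSchedule:
    """Reset epsilon and overwrite the decay inherited from the previous task."""
    schedule.epsilon = epsilon_reset
    schedule.epsilon_decay = epsilon_decay
    return schedule


# Task 1: train with the decay chosen for the first task.
schedule = EpsilonSchedule(epsilon=1.0, epsilon_decay=0.999)
for _ in range(1000):
    schedule.step()

# Task 2: "load" the agent, then overwrite its decay to match the epsilon reset.
second_task_decay = decay_for(epsilon_reset=0.3, min_epsilon=0.01, steps=500)
prepare_for_next_task(schedule, epsilon_reset=0.3, epsilon_decay=second_task_decay)
```

The point of the override is that the decay tuned for the first task would otherwise carry over unchanged into the second one, regardless of the value epsilon is reset to.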

krezelj / academia