Open Kaixhin opened 7 years ago
I gave it a shot. However, I am not sure how the discounted reward R is supposed to be used, and I still need to check whether the future and past k-step transitions are valid.
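For anyone comparing implementations, here is a minimal sketch of how I read the bounds in the paper. It is not the authors' code. The function names, the array layout (`rewards`, `q_taken`, `q_max_next`) and the choice to start at k = 0 are my own assumptions. Under this reading, a future k-step transition is valid only if it stays inside the same episode, a past one only if it does not reach before the episode start, and the discounted return R_j can serve as one more lower bound on Q(s_j, a_j).

```python
import numpy as np

def discounted_returns(rewards, gamma):
    """R_j = sum_{i >= j} gamma^(i-j) * r_i for one finished episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def tightening_bounds(rewards, q_taken, q_max_next, j, K, gamma):
    """Lower and upper bounds on Q(s_j, a_j) from up to K future/past steps.

    Hypothetical layout for a single episode of length T:
      rewards[t]    = r_t
      q_taken[t]    = Q(s_t, a_t) from the target network
      q_max_next[t] = max_a Q(s_{t+1}, a), set to 0 if s_{t+1} is terminal
    """
    T = len(rewards)

    # Lower bound: max over k of sum_{i<=k} gamma^i r_{j+i} + gamma^(k+1) max_a Q(s_{j+k+1}, a)
    lower, partial = -np.inf, 0.0
    for k in range(K):
        t = j + k
        if t >= T:  # would cross the end of the episode: not a valid transition
            break
        partial += gamma ** k * rewards[t]
        lower = max(lower, partial + gamma ** (k + 1) * q_max_next[t])

    # Upper bound: min over k of gamma^-(k+1) * (Q(s_{j-k-1}, a_{j-k-1}) - sum_{i<=k} gamma^i r_{j-k-1+i})
    upper = np.inf
    for k in range(K):
        start = j - k - 1
        if start < 0:  # would reach before the episode start: not valid
            break
        past = sum(gamma ** i * rewards[start + i] for i in range(k + 1))
        upper = min(upper, (q_taken[start] - past) / gamma ** (k + 1))

    return lower, upper
```

With these in hand, one option is to add quadratic penalties to the usual Q-learning loss whenever Q(s_j, a_j) falls below `max(lower, R_j)` or rises above `upper`. How the penalties are weighted is a detail I am unsure about.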
Awesome - I'll try and have a look soon or next week! Would you be able to test it to try and replicate one of the results from the paper?
I started on this myself as well, so will see how our implementations compare.
Hi, have you reproduced the optimality tightening results? I have tried some games using TensorFlow and OpenAI Gym, but my results are much worse than the paper's. I am not sure whether I have misunderstood something or missed some tricks in the paper. It seems that the paper doesn't include every detail of their work.
Does anyone know whether they have published the source code for optimality tightening from the paper?
No, they haven't published their code as far as I know. The tricks they use are not hard to implement, but I still cannot match their performance.
I have tried implementing optimality tightening (see earlier post) but the results I get are also much worse than the paper's.
In my experience the smallest details in a paper can be key to reproducing results - and these may be missing or ambiguous. If anyone is reasonably confident in their implementation, you should try contacting one of the authors with specific questions.
Hi guys, I have released the code at https://github.com/ShibiHe/Q-Optimality-Tightening. Please have a look.
Best, Shibi
Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening potentially speeds up Q-learning by an order of magnitude! Apparently not too hard to implement either.