IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms

https://intellabs.github.io/coach/

Apache License 2.0

bug fixes for OPE #311

Closed. gal-leibovich closed this 5 years ago.

gal-leibovich commented 5 years ago

Several bug fixes for the Sequential Doubly Robust (Sequential DR) and Weighted Importance Sampling estimators used in off-policy evaluation (OPE). Also renames get_shuffled_data_generator to get_shuffled_training_data_generator.

Sequential DR processed the transitions of each episode in the wrong order. Weighted Importance Sampling did not handle some corner cases.
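
For readers unfamiliar with the two estimators, here is a minimal, self-contained sketch of why transition order matters for Sequential DR and what a Weighted Importance Sampling corner case can look like. It is not the code from this PR: the function names, argument layout, and default gamma are hypothetical, and the PR does not say which corner cases it covers, so the zero-total-weight guard below is only one plausible example.

```python
from typing import Sequence


def sequential_doubly_robust(
    rewards: Sequence[float],
    rhos: Sequence[float],
    v_hat: Sequence[float],
    q_hat: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Sequential DR estimate for a single episode (illustrative only).

    rewards[t]: reward observed at step t
    rhos[t]:    per-step importance ratio pi_e(a_t|s_t) / pi_b(a_t|s_t)
    v_hat[t]:   model estimate of V(s_t) under the evaluated policy
    q_hat[t]:   model estimate of Q(s_t, a_t) under the evaluated policy
    """
    dr = 0.0  # estimate for the (empty) tail after the last transition
    # The estimate at step t is defined in terms of the estimate at step t+1,
    # so the episode is walked from its last transition back to its first.
    for t in reversed(range(len(rewards))):
        dr = v_hat[t] + rhos[t] * (rewards[t] + gamma * dr - q_hat[t])
    return dr


def weighted_importance_sampling(
    episode_returns: Sequence[float],
    episode_weights: Sequence[float],
) -> float:
    """WIS estimate over a batch of episodes (illustrative only).

    episode_weights[i] is the product of per-step importance ratios
    for episode i; episode_returns[i] is its discounted return.
    """
    total_weight = sum(episode_weights)
    # Assumed corner case: no episodes, or every weight is zero.
    # Returning 0.0 avoids a division by zero; this choice is an assumption.
    if len(episode_returns) == 0 or total_weight == 0.0:
        return 0.0
    weighted_sum = sum(w * g for w, g in zip(episode_weights, episode_returns))
    return weighted_sum / total_weight
```

Running the Sequential DR loop forward instead of backward would discount the earlier rewards instead of the later ones. This gives a different and incorrect value whenever gamma is below 1 or the importance ratios differ across steps.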