Where can this be tested? Does OpenAI have any environments where the transition and reward functions are given beforehand?
One way around this is to take these functions straight from the environment's source code and subclass the environment, creating a version suited to fully model-based solutions.
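A minimal sketch of the subclassing idea, under stated assumptions: `ChainEnv` is a hypothetical stand-in for a Gym-style environment (it is not part of any library), and `transition`, `reward` and `value_iteration` are illustrative names rather than an existing API. The subclass restates the dynamics found in the parent's `step` method as explicit functions, which a planner can then query without sampling.

```python
import random


class ChainEnv:
    """Hypothetical Gym-style environment: a 4-state chain with a goal at the right end."""
    n_states = 4
    n_actions = 2   # 0 = left, 1 = right
    slip = 0.1      # probability that the opposite action is executed

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def _move(self, state, action):
        # Deterministic part of the dynamics, clipped to the chain.
        step = 1 if action == 1 else -1
        return min(max(state + step, 0), self.n_states - 1)

    def step(self, action):
        # The model is only implicit here: callers can sample it, not read it.
        if self.rng.random() < self.slip:
            action = 1 - action
        next_state = self._move(self.state, action)
        done = next_state == self.n_states - 1
        reward = 1.0 if done else 0.0
        self.state = next_state
        return next_state, reward, done, {}


class ModelBasedChainEnv(ChainEnv):
    """Subclass exposing the dynamics read off the parent's source code."""

    def transition(self, state, action):
        """Return {next_state: probability} for taking `action` in `state`."""
        probs = {}
        for executed, p in ((action, 1 - self.slip), (1 - action, self.slip)):
            s2 = self._move(state, executed)
            probs[s2] = probs.get(s2, 0.0) + p
        return probs

    def reward(self, state, action, next_state):
        """Same reward rule as in ChainEnv.step."""
        return 1.0 if next_state == self.n_states - 1 else 0.0


def value_iteration(env, gamma=0.9, sweeps=100):
    """Plan using only env.transition and env.reward (no sampling)."""
    goal = env.n_states - 1
    V = [0.0] * env.n_states
    for _ in range(sweeps):
        for s in range(env.n_states):
            if s == goal:
                continue  # terminal state keeps value 0
            V[s] = max(
                sum(p * (env.reward(s, a, s2) + gamma * V[s2])
                    for s2, p in env.transition(s, a).items())
                for a in range(env.n_actions)
            )
    return V


env = ModelBasedChainEnv()
values = value_iteration(env)
```

The same pattern should carry over to a real environment wherever its `step` logic can be read from source. The main risk is that the restated functions drift from what `step` actually does, so it is worth checking them against sampled transitions.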