Unable to figure out how the rewards are decided when comparing states.

amidos2006 / gym-pcgrl

A package for "Procedural Content Generation via Reinforcement Learning" OpenAI Gym interface.

MIT License


In the paper, we describe the goal for each problem, and the reward function simply reflects how much closer an action brings the map to that goal. For example, suppose the goal is to have exactly one region. If the number of regions is 4 before an action and 3 after it, the map has moved one step toward the goal, so the reward is 1 * scale factor. You can check the actual reward code, along with anything else related to a problem such as its width or height, in the Problem module in the environment. For example, here is the reward function for the binary problem: https://github.com/amidos2006/gym-pcgrl/blob/master/gym_pcgrl/envs/probs/binary_prob.py#L98
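To make the regions example concrete, here is a minimal sketch of a reward computed from the change in distance to a goal. The names `distance_to_goal` and `region_reward` and the `goal` and `scale` parameters are hypothetical and are not taken from gym-pcgrl; the linked `binary_prob.py` is the authoritative version. The sketch also assumes that moving away from the goal yields a negative reward of the same size, which the example above does not state.

```python
def distance_to_goal(value, goal):
    """Absolute distance between a measured statistic and its target value."""
    return abs(value - goal)


def region_reward(old_regions, new_regions, goal=1, scale=1.0):
    """Reward = scale * (how much closer the action moved the map to the goal).

    Hypothetical helper for illustration only, not the gym-pcgrl API.
    Positive when the new state is closer to the goal, negative when it is
    farther away (an assumption), and zero when the distance is unchanged.
    """
    old_distance = distance_to_goal(old_regions, goal)
    new_distance = distance_to_goal(new_regions, goal)
    return scale * (old_distance - new_distance)


# The example from the reply: 4 regions before the action, 3 after, goal of 1.
print(region_reward(4, 3))  # 1.0, i.e. 1 * scale factor
print(region_reward(3, 4))  # -1.0, the action moved away from the goal
print(region_reward(1, 1))  # 0.0, already at the goal and nothing changed
```

Because the reward depends only on the difference between the old and new statistics, the environment only needs to compare two states, the one before the action and the one after it.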

Unable to figure out how the rewards are decided when comparing states. #7