A Response and Response model

This pull request is the first stage of implementing the cost-constrained POMCP (CC-POMCP) algorithm. CC-POMCP cannot be added to the repository directly because it relies on variables and operations that PO-UCT and POMCP do not have, such as cost constraints and the operations on them.

To accommodate costs and other future variables, I propose using a generic model, called a response model, whose output is called a response. The name "response" comes from the notion of independent and dependent variables: a response (reward, cost, etc.) depends on the interaction with the real or simulated environment. A response model is therefore a wrapper for more specific models, such as reward and cost models (and any others added in the future). By extension, a response is a wrapper for the reward, cost, etc.
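To illustrate the wrapper idea, here is a minimal sketch. It is not the code in this pull request: the class `ConstantModel`, the keyword-argument constructor, and the dictionary return value are all assumptions made for the example. The only thing it mirrors from the description above is that a response model delegates to more specific models and bundles their outputs.

```python
class ConstantModel:
    """Hypothetical stand-in for a reward or cost model (not a pomdp-py class)."""

    def __init__(self, value):
        self.value = value

    def sample(self, state, action, next_state):
        # A real model would depend on the transition; this one ignores it.
        return self.value


class ResponseModel:
    """Illustrative wrapper that holds one named model per response variable."""

    def __init__(self, **models):
        self._models = dict(models)

    def sample(self, state, action, next_state):
        # Query every wrapped model and bundle the results under its name.
        return {
            name: model.sample(state, action, next_state)
            for name, model in self._models.items()
        }


# Assumed usage: a reward model and a cost model behind one interface.
model = ResponseModel(reward=ConstantModel(-1.0), cost=ConstantModel(0.5))
response = model.sample("tiger-left", "listen", "tiger-left")
```

With this shape, a planner that only needs the reward can read `response["reward"]`, while a cost-constrained planner can also read `response["cost"]` without a separate code path.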

The pull request has the following:

Implementation of the ResponseModel and Response classes in the basics files,
Updates to classes in basics to use ResponseModel instead of RewardModel directly,
Updates to the appropriate pre-existing algorithms, including PO-UCT and POMCP,
Updates to the code for the tiger, rocksample, mos, load_unload, and tag problems,
A test script for response arithmetic operations, test_response.py,
A passing run of the test_all.py script,
Passing runs of the tiger, rocksample, mos, load_unload, and tag problems,
Comments for the ResponseModel and Response classes.
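The list above mentions arithmetic operations on responses. The sketch below shows one way such arithmetic could work, and why planners like PO-UCT and POMCP need it to accumulate discounted returns. The field names `reward` and `cost`, the dataclass layout, and the helper `discounted_return` are assumptions for illustration. They are not taken from the pull request's code or from test_response.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """Hypothetical response bundling a reward and a cost."""

    reward: float = 0.0
    cost: float = 0.0

    def __add__(self, other):
        # Component-wise sum of two responses.
        if not isinstance(other, Response):
            return NotImplemented
        return Response(self.reward + other.reward, self.cost + other.cost)

    def __mul__(self, scalar):
        # Scale every component, e.g. by a discount factor.
        if not isinstance(scalar, (int, float)):
            return NotImplemented
        return Response(self.reward * scalar, self.cost * scalar)

    # Allow `scalar * response` as well as `response * scalar`.
    __rmul__ = __mul__


def discounted_return(responses, discount):
    """Accumulate a discounted sum over a sequence of responses."""
    total = Response()
    factor = 1.0
    for response in responses:
        total = total + factor * response
        factor *= discount
    return total


total = discounted_return([Response(1.0, 2.0), Response(2.0, 4.0)], 0.5)
```

Because addition and scaling are defined on the whole response, the accumulation loop reads the same as it would for a scalar reward.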
