h2r / pomdp-py

A framework to build and solve POMDP problems. Documentation: https://h2r.github.io/pomdp-py/
MIT License

A Response and Response model #61

Closed: troiwill closed this 5 months ago

troiwill commented 5 months ago

This pull request is the first stage of implementing the cost-constrained POMCP (CC-POMCP) algorithm. CC-POMCP cannot be added to the repository directly because it needs variables and operations that the PO-UCT and POMCP algorithms do not have, such as cost constraints and the operations on them.

To accommodate costs and other future variables, I propose a generic model, called a Response model, and a corresponding output, called a response. The name "response" comes from the notion of independent and dependent variables: a response (reward, cost, etc.) depends on the interaction with the real or simulated environment. A response model is therefore a wrapper for more specific models, such as reward and cost models (and any others added in the future). By extension, a response is a wrapper for the reward, cost, etc.
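The wrapper idea can be illustrated with a minimal, self-contained sketch. It is not the code in this pull request and does not use the pomdp-py API: the class bodies, the `sample` signature, and the helper functions `tiger_reward` and `listen_cost` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for illustration; NOT the classes added by this PR.


@dataclass(frozen=True)
class Response:
    """Bundles every quantity that depends on an environment interaction."""
    reward: float = 0.0
    cost: float = 0.0


# Assumed shape of a component model: (state, action, next_state) -> scalar.
ComponentModel = Callable[[str, str, str], float]


class ResponseModel:
    """Wraps named component models (reward, cost, ...) behind one call."""

    def __init__(self, **models: ComponentModel):
        self._models = dict(models)

    def sample(self, state: str, action: str, next_state: str) -> Response:
        # Query each wrapped model and pack the results into one Response.
        values = {
            name: model(state, action, next_state)
            for name, model in self._models.items()
        }
        return Response(**values)


def tiger_reward(state: str, action: str, next_state: str) -> float:
    """Toy tiger-style reward: listening is cheap, the tiger's door is bad."""
    if action == "listen":
        return -1.0
    return -100.0 if action == "open-" + state else 10.0


def listen_cost(state: str, action: str, next_state: str) -> float:
    """Toy cost: only listening consumes the constrained resource."""
    return 1.0 if action == "listen" else 0.0


model = ResponseModel(reward=tiger_reward, cost=listen_cost)
listen_response = model.sample("left", "listen", "left")
open_response = model.sample("left", "open-right", "left")
```

In this sketch a planner would call `model.sample(...)` once per simulated step and read `reward` and `cost` from the single returned object, so adding another quantity later would not change the planner's call site.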

This pull request includes the following:

  1. Implementation of the ResponseModel and Response classes in the basics files.
  2. Updates to the classes in basics so that they use ResponseModel instead of using RewardModel directly.
  3. Updates to the affected pre-existing algorithms, including PO-UCT and POMCP.
  4. Updates to the code for the tiger, rocksample, mos, load_unload, and tag problems.
  5. A new test script, test_response.py, for response arithmetic operations.
  6. A passing run of the test_all.py script.
  7. Passing runs of the tiger, rocksample, mos, load_unload, and tag problems.
  8. Comments for the ResponseModel and Response classes.
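To show why a response needs arithmetic operations at all, here is a hedged sketch of the kind of component-wise addition and scalar multiplication a planner relies on when it accumulates discounted returns. The contents of test_response.py are not shown in this description, so the operators below and the `discounted_total` helper are assumptions, not the PR's implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch; the operators actually implemented in the PR may differ.


@dataclass(frozen=True)
class Response:
    reward: float = 0.0
    cost: float = 0.0

    def __add__(self, other):
        # Component-wise sum of two responses.
        if not isinstance(other, Response):
            return NotImplemented
        return Response(self.reward + other.reward, self.cost + other.cost)

    def __mul__(self, scalar):
        # Scale every component, e.g. by a discount factor.
        if not isinstance(scalar, (int, float)):
            return NotImplemented
        return Response(self.reward * scalar, self.cost * scalar)

    # Allows `0.95 * response` as well as `response * 0.95`.
    __rmul__ = __mul__


def discounted_total(responses, discount):
    """Accumulates sum_t discount**t * responses[t], component by component."""
    total = Response()
    weight = 1.0
    for response in responses:
        total = total + weight * response
        weight *= discount
    return total


total = discounted_total([Response(1.0, 2.0), Response(4.0, 0.0)], 0.5)
```

With these two operators, rollout code written for scalar rewards could keep the same accumulation loop while tracking reward and cost together.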