Svalorzen / AI-Toolbox

A C++ framework for MDPs and POMDPs with Python bindings
GNU General Public License v3.0

Add Python GridWorld example #13

Closed aukejw closed 7 years ago

aukejw commented 8 years ago

At the time of writing, it's sometimes difficult to figure out how the generated Python objects work. This is partly due to their hard-to-read Boost.Python-generated docs:

In [8]: q = MDP.QLearning()
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
<ipython-input-8-de58c583951f> in <module>()
----> 1 q = MDP.QLearning()

ArgumentError: Python argument types in
    QLearning.__init__(QLearning)
did not match C++ signature:
    __init__(_object*, AIToolbox::MDP::SparseModel)
    __init__(_object*, AIToolbox::MDP::SparseModel, double)
    __init__(_object*, AIToolbox::MDP::Model)
    __init__(_object*, AIToolbox::MDP::Model, double)
    __init__(_object*, AIToolbox::MDP::SparseRLModel<AIToolbox::MDP::SparseExperience, void>)
    __init__(_object*, AIToolbox::MDP::SparseRLModel<AIToolbox::MDP::SparseExperience, void>, double)
    __init__(_object*, AIToolbox::MDP::RLModel<AIToolbox::MDP::Experience, void>)
    __init__(_object*, AIToolbox::MDP::RLModel<AIToolbox::MDP::Experience, void>, double)
    __init__(_object*, unsigned long, unsigned long)
    __init__(_object*, unsigned long, unsigned long, double)
    __init__(_object*, unsigned long, unsigned long, double, double)

Adding an example - like the gridworld example from the docs - would make these objects simpler to understand, and would be significantly less time-consuming than writing docstrings. It would be nice to show how the ValueIterationModel relates to a Model instance, how to properly instantiate them, and how to run the solver.

For example, I'm having trouble constructing the Python equivalent of:

AIToolbox::MDP::ValueIteration solver;
auto solution = solver(world);
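For reference, the computation that a value iteration solver performs on a gridworld can be sketched in plain Python. This is a minimal sketch that does not use the AI-Toolbox bindings at all: the 4x4 grid, the goal cell, the reward of 1 for reaching the goal, the discount of 0.9, and every function name below are assumptions made up for illustration.

```python
# Plain-Python value iteration on a small deterministic gridworld.
# Nothing here uses AI-Toolbox; all names and constants are hypothetical.

WIDTH, HEIGHT = 4, 4
GOAL = (3, 3)
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
DISCOUNT = 0.9


def step(state, action):
    """Return (next_state, reward); moving into a wall leaves the agent in place."""
    if state == GOAL:
        return state, 0.0  # the goal is absorbing and yields no further reward
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), WIDTH - 1)
    y = min(max(state[1] + dy, 0), HEIGHT - 1)
    next_state = (x, y)
    return next_state, (1.0 if next_state == GOAL else 0.0)


def action_value(values, state, action):
    """One-step lookahead: immediate reward plus discounted value of the successor."""
    next_state, reward = step(state, action)
    return reward + DISCOUNT * values[next_state]


def value_iteration(tolerance=1e-9, max_iterations=1000):
    """Repeat Bellman backups until the largest change drops below tolerance."""
    states = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)]
    values = {s: 0.0 for s in states}
    for _ in range(max_iterations):
        new_values = {
            s: max(action_value(values, s, a) for a in ACTIONS) for s in states
        }
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tolerance:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: action_value(values, s, a)) for s in states
    }
    return values, policy


values, policy = value_iteration()
```

After the call, `values` maps each cell to its optimal discounted return and `policy` maps each cell to a greedy action; the cell directly above the goal, `(3, 2)`, gets value 1.0 and action `"down"`.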
Svalorzen commented 8 years ago

Makes sense. I've now set up the basic infrastructure for duplicating all tests in Python, both to serve as examples and to check that everything works. Among the committed tests there is one for QLearning and one for ValueIteration.

In general, most C++ methods here are now templated and depend on the type of the model being solved (in C++ you can, in theory, implement your own model that performs better, since you know how your model should work). This means that the templated classes cannot be exposed to Python directly as they are.

Instead, we create a number of instantiations of each class, one per model type it can solve. For example, if you want to use ValueIteration with the Python MDP Model, you'd use MDP.ValueIterationModel. If you want to use the SparseModel, you'd use MDP.ValueIterationSparseModel. Not very pretty, but it works.
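The naming convention above (algorithm name followed by model type name) can be spelled out as a small lookup. This is only a sketch: `solver_class_for` is a hypothetical helper, not part of AI-Toolbox, the classes below are empty stand-ins so the snippet runs without the bindings, and it assumes that `type(model).__name__` matches the suffix used in the instantiation's name.

```python
from types import SimpleNamespace


def solver_class_for(module, algorithm, model):
    """Look up the instantiation named <algorithm><model type name> in module.

    Hypothetical helper: it assumes the type name of the model instance is
    exactly the suffix used by the exposed class (e.g. "SparseModel").
    """
    name = algorithm + type(model).__name__
    return getattr(module, name)


# Empty stand-ins for the real bindings, used only so the sketch is runnable.
class Model: ...
class SparseModel: ...
class ValueIterationModel: ...
class ValueIterationSparseModel: ...


fake_mdp = SimpleNamespace(
    ValueIterationModel=ValueIterationModel,
    ValueIterationSparseModel=ValueIterationSparseModel,
)

dense_solver_class = solver_class_for(fake_mdp, "ValueIteration", Model())
sparse_solver_class = solver_class_for(fake_mdp, "ValueIteration", SparseModel())
```

With the real module in place of `fake_mdp`, the same lookup would pick the instantiation that matches whichever model object you pass in, provided the assumption about type names holds.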

For the docs, I've now added documentation to the MDP.Model class. Unfortunately this takes a lot of time, since it must be embedded in the C++ code as strings, and it is almost always the same as the C++ comments (apart from the occasional additional clarification). I hope that I'll slowly be able to put all the comments in the Python classes too. In the meantime, I have removed the C++ docs from the Python help, which should make what little documentation Boost.Python provides by default a bit more readable.

aukejw commented 8 years ago

See this pull request

Svalorzen commented 7 years ago

Closing this as the example was merged.