I just merged these changes (in PR #1), but I still want to give anyone a chance to comment. Before this change, there was only one generate() function.
If anyone is watching this (especially @ebalaban), I'd appreciate any feedback. I think I am going to switch this interface from a single generate() function to a generate_ function for each combination of return values. For example, generate_s() would return only a state, while generate_sor() would return a (state, observation, reward) tuple.
There are two reasons for this:
1. It was weird for generate(::POMDP, ...) to return (s', o, r) while generate(::MDP, ...) returned only (s', r).
2. I think this will be clearer and more flexible for everyone (for example, a solver might need only s and r from a POMDP rather than s, o, and r), and it might even be simpler (for example, implementing generate_s() and reward() for a problem seems simpler than implementing a generate() that returns a tuple).
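To make the idea concrete, here is a rough sketch of how the split functions could fit together. It is only an illustration, not the merged interface: the argument order, the `generate_o` and `generate_sr` names, the stand-in `POMDP` type, and the `ToyPOMDP` problem are all assumptions made up for this example.

```julia
using Random

# Stand-in for the real abstract problem type, so the sketch runs on its own.
abstract type POMDP end

# Hypothetical toy problem: the state is an integer and the action is added to it.
struct ToyPOMDP <: POMDP end

# A problem writer implements only the small pieces...
generate_s(p::ToyPOMDP, s::Int, a::Int, rng::AbstractRNG) = s + a + rand(rng, 0:1)
generate_o(p::ToyPOMDP, s::Int, a::Int, sp::Int, rng::AbstractRNG) = sp > 0
reward(p::ToyPOMDP, s::Int, a::Int, sp::Int) = -abs(sp)

# ...and generic fallbacks (assumed here) compose the combined variants.
function generate_sr(p, s, a, rng::AbstractRNG)
    sp = generate_s(p, s, a, rng)
    return sp, reward(p, s, a, sp)
end

function generate_sor(p, s, a, rng::AbstractRNG)
    sp = generate_s(p, s, a, rng)
    o = generate_o(p, s, a, sp, rng)
    return sp, o, reward(p, s, a, sp)
end
```

With something like this, a solver that only needs s' and r calls generate_sr() and never touches the observation model, while a solver that needs everything calls generate_sor().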
Does this seem reasonable? Does anyone see any potential consequences/difficulties that I missed?