Closed Svalorzen closed 6 years ago
I thought more about this and I think it's actually more sensible to keep the getExpectedReward(s,a,s1)
signature. It's only the Eigen part that will change, and the fact that our Models will only keep a 2d table for rewards (so getExpectedReward
will return the same thing for all s1
).
This makes easier to define models and shouldn't deprecate old user code, but should still speedup Eigen methods in case somebody wants to use them.
I'm currently implementing this in the GapMin branch. It seems it will also speed the library somewhat, which is always nice.
This has been merged in master.
While it has been all fun and games storing the R(s,a,s) for all models, I think that the time has come to remove this "feature" and just store the R(s,a), which is pretty much equivalent.
This simultaneously reduces the computational costs of pretty much all methods, since they often have to compute R(s,a) anyway, and memory consumption (by a factor of |S|, which is not little).
We should still offer an API in order to create models from a 3D reward function, but internally it would be much much better to stop storing the whole thing.
This will change a bunch of the API for models since they won't offer a s,a,s' access to the reward function anymore, but I think in the long run it's going to be better, considering the performance improvements.
I'm currently finishing a branch, and this will be the next thing I do after that.