JuliaPOMDP / RockSample.jl

Implement the rock sample problem using POMDPs.jl
Other
5 stars 5 forks source link

Speed up stateindex and implement solvers for solving for MDP and QMDP solutions fastly #10

Closed Wu-Chenyang closed 3 years ago

Wu-Chenyang commented 3 years ago

When using an MDP or QMDP solution as a heuristic in online planning, states need to be transformed into indexes with stateindex function. However, this function previously runs slow, because LinearIndices requires dynamic dispatch. I managed to speed it up but in an inelegant way. Is there a better way to do this? Discrete value iteration requires several days to solve an MDP solution for RS(15,15), which is partly caused by the low efficiency of stateindex. I provided a DP algorithm for solving the optimal utility. Now the MDP solution for RS(15,15) can be solved in 10s. In addition, a heuristic RSExit is provided, which is essentially the same as the fixed action policy move east but avoids senseless rollouts.

codecov[bot] commented 3 years ago

Codecov Report

Merging #10 (6a114bc) into master (6b2dc03) will decrease coverage by 0.59%. The diff coverage is 86.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #10      +/-   ##
==========================================
- Coverage   86.46%   85.86%   -0.60%     
==========================================
  Files           7        8       +1     
  Lines         133      184      +51     
==========================================
+ Hits          115      158      +43     
- Misses         18       26       +8     
Impacted Files Coverage Δ
src/reward.jl 100.00% <ø> (ø)
src/transition.jl 90.00% <ø> (ø)
src/heuristics.jl 82.60% <82.60%> (ø)
src/RockSample.jl 100.00% <100.00%> (ø)
src/actions.jl 100.00% <100.00%> (ø)
src/observations.jl 100.00% <100.00%> (ø)
src/states.jl 100.00% <100.00%> (ø)
src/visualization.jl 72.88% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 6b2dc03...e601f49. Read the comment docs.

zsunberg commented 3 years ago

Thanks! This looks good to me! @MaximeBouton do you have any comments?

One question: what does rs_util do? I'm assuming it calculates the value for the underlying MDP. If so, could you rename it to something that makes it more clear that it is the util for the fully observable mdp like rs_mdp_util? Also adding a short docstring would be helpful :)

MaximeBouton commented 3 years ago

Taking a look right now!

MaximeBouton commented 3 years ago

Thank you for the contribution, the solution for speeding up the stateindex function looks fine. You added a lot of other features: heuristics, rsgen, rsutil. It would be nice to at least add docstrings for each of them. Also see my comment on potentially renaming rsgen.

MaximeBouton commented 3 years ago

When I used this repo, I always used the SparseValueIteration solver which was rather fast, but of course now the conversion to a sparse tabular pomdp will be even faster ;).

Wu-Chenyang commented 3 years ago

I've added some docstrings and renamed rsgen and rs_util. Is there something that is still not clear?

Wu-Chenyang commented 3 years ago

On the SparseValueIteration, I should have known this before.😢 It is indeed significantly faster.

zsunberg commented 3 years ago

I think it looks good. Thanks again for the contribution! One note: It is better to use strings for documentation rather than comments because that can hook into the built in Julia help system or Documenter.jl. I made suggestions above.

@MaximeBouton I'll let you merge if you are satisfied.

MaximeBouton commented 3 years ago

Looks great, I will merge

Wu-Chenyang commented 3 years ago

I realized there is a mistake in [compat]. The version of StaticArrays can also be 0.12.

Wu-Chenyang commented 3 years ago

I thought it was me, but it wasn't. Maybe I should comment in #11 ?