JuliaPOMDP / BasicPOMCP.jl

The PO-UCT algorithm (aka POMCP) implemented in Julia

Legal Actions during Rollout #35

Open LorenzoBonanni opened 1 year ago

LorenzoBonanni commented 1 year ago

During a random rollout, the action is sampled from all actions of the environment rather than only from the actions that are legal in the current state.

zsunberg commented 1 year ago

If you use FORollout it should use the state-based action space.

If you use PORollout, by default it uses the NothingUpdater, which doesn't pass any information in the belief to limit the action space. If you still want to use PORollout and get only the legal actions based on the belief, you can use a different updater with it.
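To make the distinction concrete, here is a small sketch comparing the full action space with the state-dependent one. It assumes the problem defines a state-dependent `actions(pomdp, s)` method. RockSample with its default constructor is used purely as an illustration, and the seed is arbitrary.

```julia
using POMDPs
using POMDPTools
using RockSample
using Random

# Default RockSample problem, used only for illustration.
pomdp = RockSamplePOMDP()

# Draw one state from the initial state distribution.
s = rand(MersenneTwister(1), initialstate(pomdp))

# Full action space: what a rollout sees when it has no state or belief
# information to restrict it.
all_actions = collect(actions(pomdp))

# State-dependent action space: what a rollout can use when it knows s.
legal_actions = collect(actions(pomdp, s))

println("all actions: ", all_actions)
println("legal in s:  ", legal_actions)
```

If the two lists differ for some state, a rollout that only sees the full action space can pick actions that are not legal in that state.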

LorenzoBonanni commented 1 year ago

I tried using PORollout with DiscreteUpdater, but it throws an error because there is no extract_belief method for that updater; one is defined only for NothingUpdater and PreviousObservationUpdater.

Here is the code I used:

```julia
using RockSample
using POMDPs
using POMDPTools
using BasicPOMCP
using Random
using ParticleFilters

rocks = [[10, 1], [11, 6], [9, 6], [2, 5]]
const n_particle = 32768 # 2^15
rand_noise_generator_for_sim = MersenneTwister(2980164632)
rand_noise_generator_seed_for_planner = MersenneTwister(941564507)
env = RockSamplePOMDP{4}(
    map_size=(12, 12),
    rocks_positions=rocks,
    sensor_efficiency=20.0,
    discount_factor=0.95,
    good_rock_reward=10.0,
    bad_rock_penalty=-10.0,
    sensor_use_penalty=0.0,
    step_penalty=0.0
)
pf = UnweightedParticleFilter(env, n_particle, rand_noise_generator_for_sim)

solver = POMCPSolver(
    estimate_value=PORollout(
        RandomSolver(rand_noise_generator_for_sim),
        DiscreteUpdater(env)
    ),
    max_depth=100,
    c=1.0,
    tree_queries=n_particle,
    rng=rand_noise_generator_seed_for_planner
)
policy = solve(solver, env)
ib = initialstate(env)
a, ai = action_info(policy, ib)
```

zsunberg commented 1 year ago

Got it. Is there a reason you can't use FORollout? It may be possible to use PORollout, but you will have to write some more code.
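For reference, a minimal sketch of what the FORollout variant of the solver could look like. It is not taken from the thread: the default RockSample problem, the small planning budget, and the seeds are assumptions chosen so the example runs quickly.

```julia
using POMDPs
using POMDPTools
using BasicPOMCP
using RockSample
using Random

# Default RockSample problem (an assumption; the report uses a 12x12 map).
pomdp = RockSamplePOMDP()

solver = POMCPSolver(
    # FORollout runs the rollout on the sampled true state, so the random
    # policy can draw from the state-dependent action space.
    estimate_value=FORollout(RandomSolver(rng=MersenneTwister(1))),
    max_depth=20,
    c=1.0,
    tree_queries=1_000,
    rng=MersenneTwister(2),
)

planner = solve(solver, pomdp)

# Plan from the initial belief and print the chosen action.
a = action(planner, initialstate(pomdp))
println("chosen action: ", a)
```

Compared with the PORollout setup, no updater or extract_belief method is needed here, because the rollout never has to reconstruct a belief.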

LorenzoBonanni commented 1 year ago

No, there isn't. I was just playing around.