LorenzoBonanni opened this issue 1 year ago
If you use `FORollout`, it should use the state-based action space. If you use `PORollout`, by default it uses the `NothingUpdater`, which doesn't pass any information in the belief that could limit the action space. If you still want to use `PORollout` and get only the legal actions based on the belief, you can pass it a different updater.
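For reference, a minimal sketch of the `FORollout` setup (assuming the standard BasicPOMCP and POMDPTools constructors; the solver parameters here are just placeholders):

```julia
using BasicPOMCP
using POMDPTools
using Random

# FORollout runs the rollout policy on simulated states, so the state-based
# action space is used and no belief updater is needed.
solver = POMCPSolver(
    estimate_value=FORollout(RandomSolver(MersenneTwister(1))),
    max_depth=100,
    c=1.0
)
```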
I've tried using `PORollout` with a `DiscreteUpdater`, but it gives me an error because the `extract_belief` function is missing; it is only implemented for `NothingUpdater` and `PreviousObservationUpdater`.
Here is the code I used:

```julia
using RockSample
using POMDPs
using POMDPTools
using BasicPOMCP
using Random
using ParticleFilters

rocks = [[10, 1], [11, 6], [9, 6], [2, 5]]
const n_particle = 32768 # 2^15
rand_noise_generator_for_sim = MersenneTwister(2980164632)
rand_noise_generator_seed_for_planner = MersenneTwister(941564507)

env = RockSamplePOMDP{4}(
    map_size=(12, 12),
    rocks_positions=rocks,
    sensor_efficiency=20.0,
    discount_factor=0.95,
    good_rock_reward=10.0,
    bad_rock_penalty=-10.0,
    sensor_use_penalty=0.0,
    step_penalty=0.0
)

pf = UnweightedParticleFilter(env, n_particle, rand_noise_generator_for_sim)

# PORollout with a DiscreteUpdater is what triggers the missing extract_belief error.
solver = POMCPSolver(
    estimate_value=PORollout(
        RandomSolver(rand_noise_generator_for_sim),
        DiscreteUpdater(env)
    ),
    max_depth=100,
    c=1.0,
    tree_queries=n_particle,
    rng=rand_noise_generator_seed_for_planner
)

policy = solve(solver, env)
ib = initialstate(env)
a, ai = action_info(policy, ib)
```
Got it. Is there a reason you can't use `FORollout`? It may be possible to use `PORollout`, but you will have to write some more code.
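The extra code would be an `extract_belief` method for the updater passed to `PORollout`. A rough, hypothetical sketch follows; it assumes `BasicPOMCP.extract_belief(updater, node)` is the hook and `BasicPOMCP.BeliefNode` is the node type, and the body simply ignores the node, which is crude but shows the shape of the method:

```julia
using BasicPOMCP
using POMDPTools

# Hypothetical sketch: BasicPOMCP only ships extract_belief methods for
# NothingUpdater and PreviousObservationUpdater, so PORollout with a
# DiscreteUpdater needs a user-defined method. What belief to build from the
# tree node is a design choice; this placeholder ignores the node and returns
# a uniform belief over all states.
function BasicPOMCP.extract_belief(up::DiscreteUpdater, node::BasicPOMCP.BeliefNode)
    return uniform_belief(up.pomdp)
end
```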
No, there isn't; I was just playing around.
When doing a random rollout, the action is selected from all of the environment's actions, without restricting the choice to only the legal actions.
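In POMDPs.jl, a per-state (legal) action space is expressed with `actions(m, s)`; as far as I understand, a state-based rollout policy can only restrict itself to legal actions if the problem defines that method. A hypothetical sketch with a made-up problem type, just to illustrate the interface:

```julia
using POMDPs

# Hypothetical problem type, only to show where the legal-action restriction lives.
struct ToyPOMDP <: POMDP{Int,Symbol,Bool} end

POMDPs.actions(::ToyPOMDP) = (:left, :right, :sample)     # full action space
POMDPs.actions(::ToyPOMDP, s::Int) =                      # legal actions in state s
    s > 0 ? (:left, :right, :sample) : (:left, :right)
```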