JuliaPOMDP / DeepQLearning.jl

Implementation of the Deep Q-learning algorithm to solve MDPs

Action masking feature (legal actions) #68

Open filchristou opened 9 months ago

filchristou commented 9 months ago

POMDPs.jl supports state-dependent action spaces.

However, DeepQLearning.jl always picks from the full action space. That is because `solve` enumerates the actions once here and hands them to the policy, where they are used throughout from then on.

Can you think of a way to support action masking with the current implementation?
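To make the request concrete, here is a minimal sketch of what masking could look like at action-selection time. It is plain Julia and not part of DeepQLearning.jl: `masked_argmax`, `all_actions`, `legal_here`, and the Q-values are all hypothetical. In a real problem the legal set would presumably come from `actions(mdp, s)` and the full set from `actions(mdp)`.

```julia
# Hypothetical helper, not part of DeepQLearning.jl.
# Returns the index of the highest-valued action among those marked legal.
function masked_argmax(qvalues::AbstractVector{<:Real}, legal::AbstractVector{Bool})
    length(qvalues) == length(legal) ||
        throw(DimensionMismatch("mask and Q-values differ in length"))
    any(legal) || throw(ArgumentError("no legal action available"))
    best_idx = 0
    for i in eachindex(qvalues)
        # Skip illegal actions; keep the first legal one or any strictly better one.
        if legal[i] && (best_idx == 0 || qvalues[i] > qvalues[best_idx])
            best_idx = i
        end
    end
    return best_idx
end

# Toy data (assumed for illustration only).
all_actions = [:left, :right, :up, :down]   # full action space, enumerated once
legal_here  = [:left, :up]                  # actions legal in the current state
mask = [a in legal_here for a in all_actions]
q    = [0.1, 0.9, 0.4, 0.7]                 # one Q-value per action in all_actions

best = all_actions[masked_argmax(q, mask)]
```

The network still outputs one Q-value per action in the full space; the mask only restricts which entries are eligible. The same mask would also have to be applied when computing the max over next-state Q-values in the training target, which this sketch does not cover.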

filchristou commented 9 months ago

Hm, or perhaps by passing in a customized exploration policy.
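As a rough idea of what such an exploration policy would need to do, here is an epsilon-greedy rule restricted to legal actions. It is a standalone sketch in plain Julia: `masked_epsilon_greedy` and its arguments are hypothetical names, and how it would be wired into the solver is not shown and would need checking against the package.

```julia
using Random

# Hypothetical sketch: epsilon-greedy over legal actions only.
# `legal` is a Bool mask aligned with `qvalues`; returns an action index.
function masked_epsilon_greedy(rng::AbstractRNG, qvalues::AbstractVector{<:Real},
                               legal::AbstractVector{Bool}, eps::Real)
    legal_idx = findall(legal)              # indices of the legal actions
    isempty(legal_idx) && throw(ArgumentError("no legal action available"))
    if rand(rng) < eps
        return rand(rng, legal_idx)         # explore: uniform over legal actions
    end
    # Exploit: argmax over the legal entries, mapped back to the full index.
    return legal_idx[argmax(view(qvalues, legal_idx))]
end

rng  = MersenneTwister(1)
q    = [0.1, 0.9, 0.4, 0.7]
mask = [true, false, true, false]

greedy_choice = masked_epsilon_greedy(rng, q, mask, 0.0)                     # never explores
random_choices = [masked_epsilon_greedy(rng, q, mask, 1.0) for _ in 1:20]   # always explores
```

Note that masking only during exploration would not be enough by itself: the greedy policy returned after training would still need to respect the same mask.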