agi-brain / xuance

XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library
https://xuance.readthedocs.io/
MIT License

Alternative for multi-dimensional discrete actions in discrete PPO (policy uses Categorical_AC) #55

Closed HawkQ closed 1 month ago

HawkQ commented 1 month ago

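Currently, when PPO uses Categorical_AC and the action is two-dimensional, with action_space defined as MultiDiscrete, it raises AttributeError: 'MultiDiscrete' object has no attribute 'n'. From the traceback, the cause is that action_dim is obtained through action_space.n, but MultiDiscrete has no attribute n, hence the error. My action space is two-dimensional discrete. Is there another way to handle this? Thanks!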

wenzhangliu commented 1 month ago

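Very sorry, XuanCe does not support MultiDiscrete yet. You could consider encoding the two-dimensional discrete action as a one-dimensional discrete action and handling it that way.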
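
The encoding suggested above can be sketched as a mixed-radix (row-major) mapping between a multi-dimensional discrete action and a single integer. This is only an illustration, not part of XuanCe: the helper names `flat_action_count`, `encode_action`, and `decode_action` are hypothetical, and `nvec` is assumed to hold the number of choices per action dimension, as in a MultiDiscrete space.

```python
from math import prod
from typing import List, Sequence


def flat_action_count(nvec: Sequence[int]) -> int:
    """Size of the equivalent one-dimensional discrete space (its `n`)."""
    return prod(nvec)


def encode_action(action: Sequence[int], nvec: Sequence[int]) -> int:
    """Map a multi-dimensional discrete action to one index (row-major order)."""
    if len(action) != len(nvec):
        raise ValueError("action and nvec must have the same length")
    index = 0
    for a, n in zip(action, nvec):
        if not 0 <= a < n:
            raise ValueError(f"component {a} is outside [0, {n})")
        index = index * n + a
    return index


def decode_action(index: int, nvec: Sequence[int]) -> List[int]:
    """Inverse of encode_action: recover the per-dimension components."""
    if not 0 <= index < prod(nvec):
        raise ValueError(f"index {index} is outside [0, {prod(nvec)})")
    components = []
    # Peel off the last (fastest-varying) dimension first.
    for n in reversed(nvec):
        index, a = divmod(index, n)
        components.append(a)
    return components[::-1]


# Hypothetical example: a 2-D action with 3 and 4 choices per dimension.
nvec = [3, 4]
n_flat = flat_action_count(nvec)      # 12 joint actions
flat = encode_action([2, 1], nvec)    # 2 * 4 + 1 = 9
original = decode_action(flat, nvec)  # [2, 1]
```

Under this assumption, the policy would be trained on a one-dimensional discrete space with `n = flat_action_count(nvec)`, and the environment side would call `decode_action` on each sampled index before stepping. Note that the flattened space grows as the product of the per-dimension sizes, so this is practical mainly when that product stays small.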