ZhengYinan-AIR / FISOR

[ICLR 2024] The official implementation of "Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model"
https://zhengyinan-air.github.io/FISOR/

Actor issue #2

JiahuiZhu666 closed this issue 6 months ago

JiahuiZhu666 commented 6 months ago

Hi, I trained the model on the PointRobot env using the same parameters as in train_offline.py. I find that the actor performs well at some points but does not produce a feasible policy at others. Why does this happen, and how should I interpret my results?

Best, Jiahui

ZhengYinan-AIR commented 6 months ago

Hi, if the starting point is infeasible, the policy cannot guarantee safety. The data distribution also affects offline performance: if you start from a region where the data is sparse, safety may also be poor. We mainly use PointRobot to test the accuracy of the feasible region and have not evaluated policy performance on it in detail. You can collect more data and test it.
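
One way to separate the two effects mentioned above when analysing results is to label each evaluation start point before looking at its rollout. The following is only a minimal sketch and is not taken from the FISOR code: the function `classify_start_points`, the `radius` and `min_neighbors` thresholds, and the caller-supplied `feasible_mask` are all hypothetical names chosen for illustration. It counts how many dataset states lie near each start point and tags the point as infeasible, sparsely covered, or well covered.

```python
import numpy as np


def classify_start_points(starts, dataset_states, feasible_mask,
                          radius=0.1, min_neighbors=5):
    """Label start points by feasibility and local data density.

    All arguments are assumptions for illustration:
      starts:         (n_starts, d) array of evaluation start states
      dataset_states: (n_data, d) array of states in the offline dataset
      feasible_mask:  (n_starts,) booleans supplied by the caller
      radius:         neighbourhood size used to count nearby data
      min_neighbors:  below this count a start is called "sparse-data"
    """
    starts = np.asarray(starts, dtype=float)
    dataset_states = np.asarray(dataset_states, dtype=float)

    # Pairwise Euclidean distances, shape (n_starts, n_data).
    diffs = starts[:, None, :] - dataset_states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Number of dataset states within `radius` of each start point.
    neighbor_counts = (dists <= radius).sum(axis=1)

    labels = []
    for feasible, count in zip(feasible_mask, neighbor_counts):
        if not feasible:
            labels.append("infeasible")       # no safety guarantee expected
        elif count < min_neighbors:
            labels.append("sparse-data")      # poor coverage in the dataset
        else:
            labels.append("in-distribution")  # feasible and well covered
    return labels, neighbor_counts
```

Reporting safety separately for each label would show whether the failures cluster in infeasible or sparsely covered regions. The thresholds are arbitrary and would need tuning to the scale of the state space.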