Closed JiahuiZhu666 closed 6 months ago
Hi, if the starting point is infeasible, the policy cannot guarantee safety. The data distribution also affects offline performance: if you start from a region where the data is sparse, safety may also be poor. We mainly use PointRobot to test the accuracy of the feasible region and have not evaluated the policy's performance in detail. You can collect more data and test again.
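One rough way to check whether the failures line up with sparse data is to group your evaluation start states by how close they are to the offline dataset and compare violation rates per group. The snippet below is only a sketch and is not part of the repository: the function names, the nearest-neighbor distance as a stand-in for data density, and the `threshold` value are all assumptions you would need to adapt to your own evaluation code.

```python
import math

def nearest_data_distance(start, data_states):
    """Euclidean distance from one start state to its nearest dataset state."""
    return min(math.dist(start, s) for s in data_states)

def violation_rate_by_coverage(starts, violated, data_states, threshold):
    """Compare violation rates for well-covered vs. sparsely covered starts.

    starts:      list of start states used for evaluation (hypothetical input)
    violated:    list of bools, True if the rollout from that start was unsafe
    data_states: states taken from the offline dataset
    threshold:   distance below which a start counts as "covered" (assumed, tune it)
    """
    groups = {"covered": [], "sparse": []}
    for start, v in zip(starts, violated):
        dist = nearest_data_distance(start, data_states)
        key = "covered" if dist <= threshold else "sparse"
        groups[key].append(bool(v))
    # None marks a group that received no start states.
    return {k: (sum(vs) / len(vs) if vs else None) for k, vs in groups.items()}

if __name__ == "__main__":
    # Toy values for illustration only, not real PointRobot data.
    data_states = [(0.0, 0.1), (0.2, 0.0)]
    starts = [(0.0, 0.0), (5.0, 5.0)]
    violated = [False, True]
    print(violation_rate_by_coverage(starts, violated, data_states, threshold=0.5))
```

If the "sparse" group shows a clearly higher violation rate, that points to data coverage rather than the training parameters. Start states that are themselves infeasible should be reported separately, since safety cannot be guaranteed from them in any case.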
Hi, I trained the model on the PointRobot env using the same parameters as in train_offline.py. The actor performs well at some points but does not produce a feasible policy at others. Why does this happen, and how should I explain my results?
Best, Jiahui