liuzuxin / DSRL

🔥 Datasets and env wrappers for offline safe reinforcement learning
https://offline-saferl.org
Apache License 2.0

How to determine the number of trajectories under different settings in a task #6

Closed ZhaoRunyi closed 1 month ago

ZhaoRunyi commented 4 months ago

Hi! I've read your paper and it's great work. The paper reports the number of trajectories for each setting of a task (for instance, in the SafetyPointGoal task, the hard constraint setting SafetyPointGoal1-v0 has 2022 trajectories while the soft constraint setting SafetyPointGoal2-v0 has 3442) and states that these are the counts after applying the density filter. However, it's still unclear:

  1. How many raw trajectories you generated at first for each setting
  2. What is the ratio of trajectories generated by the different methods (BC-All, BC-Safe, etc.)? This seems to be a key hyperparameter that many related works focus on.

Thank you for answering my questions in advance and I'm looking forward to your reply.

liuzuxin commented 3 months ago

Hi @ZhaoRunyi, unfortunately we do not have the number of raw trajectories, because we did the filtering during the data collection process. Otherwise, storing the entire dataset after each RL training run would consume a huge amount of storage and memory. The ratios between the different methods are not very important; the cost thresholds matter more. This is because the goal is to get diverse trajectories that cover as much of the reward-cost plot as possible, so the algorithm that collects these trajectories is not very important.
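
To make the idea of filtering during collection concrete, here is a minimal sketch of a streaming, grid-based density filter over the reward-cost plane. It is only an illustration, not the filter used to build the DSRL datasets: the class name `StreamingDensityFilter`, the bin widths, and the per-cell cap `max_per_cell` are all hypothetical. Each finished episode is offered to the filter with its total cost and reward, and it is kept only if its grid cell is not already full, so the raw trajectories never need to be stored.

```python
from collections import Counter


class StreamingDensityFilter:
    """Hypothetical sketch: cap the number of kept trajectories per (cost, reward) cell."""

    def __init__(self, cost_bin=10.0, reward_bin=10.0, max_per_cell=2):
        self.cost_bin = cost_bin          # assumed cell width along the cost axis
        self.reward_bin = reward_bin      # assumed cell width along the reward axis
        self.max_per_cell = max_per_cell  # assumed cap on trajectories per cell
        self.counts = Counter()           # kept trajectories per grid cell
        self.seen = 0                     # raw trajectories offered so far

    def offer(self, episode_cost, episode_reward):
        """Return True if the trajectory should be stored, False if it is dropped."""
        self.seen += 1
        cell = (int(episode_cost // self.cost_bin),
                int(episode_reward // self.reward_bin))
        if self.counts[cell] >= self.max_per_cell:
            return False  # this region of the reward-cost plot is already dense enough
        self.counts[cell] += 1
        return True

    @property
    def kept(self):
        """Number of trajectories accepted so far."""
        return sum(self.counts.values())


if __name__ == "__main__":
    flt = StreamingDensityFilter(cost_bin=10.0, reward_bin=10.0, max_per_cell=2)
    # Hypothetical (episode_cost, episode_reward) pairs from a collection run.
    episodes = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (15.0, 1.0)]
    decisions = [flt.offer(c, r) for c, r in episodes]
    print(decisions, flt.kept, flt.seen)
```

In a scheme like this, what shapes the final dataset is which cells of the reward-cost plane get visited (driven by the cost thresholds used during training), not which algorithm produced a given trajectory. A counter such as `seen` is the kind of bookkeeping that would be needed to report a raw trajectory count afterwards.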