**Closed** — vmoens closed this 1 month ago
Note: Links to docs will display an error until the docs builds have been completed.
As of commit caa258f9e5c8d22571d49bcdddbbe68b81d353d4 with merge base 726e95955009c73dc0242424182222e59a9056d7:
* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940351256) ([gh](https://github.com/pytorch/rl/actions/runs/9416665747/job/25940351256)) `RuntimeError: Command docker exec -t bd9b40a89edf1030bfb9fa47d73d81cd2323326f53d7ab9a303f5d9b850d2d11 /exec failed with exit code 1`
* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940293828) ([gh](https://github.com/pytorch/rl/actions/runs/9416665746/job/25940293828)) `RuntimeError: Command docker exec -t 385f35062da772aa27699b3a16718b96fe1d99318dcc6cc84d6f6955b3c1a1e2 /exec failed with exit code 1`
* [Libs Tests on Linux / unittests-sklearn (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940294029) ([gh](https://github.com/pytorch/rl/actions/runs/9416665746/job/25940294029)) `RuntimeError: Command docker exec -t 2ce6862202943d37144a99377297a686139fead1154668847367d9096482dd23 /exec failed with exit code 1`
* [RLHF Tests on Linux / unittests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940307777) ([gh](https://github.com/pytorch/rl/actions/runs/9416665750/job/25940307777)) `RuntimeError: Command docker exec -t 659b77bfcaa7450ba47a9aca3dcd6d5226f3fa4a21385f40e89dc76fab1ee83e /exec failed with exit code 1`
* [Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940358798) ([gh](https://github.com/pytorch/rl/actions/runs/9416665739/job/25940358798)) `RuntimeError: Command docker exec -t f456252507690dea75b915ee7addfc29e39362dd61da55c82a7fc37506654739 /exec failed with exit code 1`
* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940269101) ([gh](https://github.com/pytorch/rl/actions/runs/9416665748/job/25940269101)) `The process 'C:\Program Files\Git\cmd\git.exe' failed with exit code 128`
* [Wheels / test-wheel (linux, ubuntu-20.04, 3.10)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940348592) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940348592))
* [Wheels / test-wheel (linux, ubuntu-20.04, 3.11)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940348824) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940348824)) `##[error]The operation was canceled.`
* [Wheels / test-wheel (linux, ubuntu-20.04, 3.8)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940347960) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940347960)) `##[error]The operation was canceled.`
* [Wheels / test-wheel (linux, ubuntu-20.04, 3.9)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940348242) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940348242)) `ModuleNotFoundError: No module named 'dm_env'`
* [Wheels / test-wheel-windows (3.11)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940447162) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940447162)) `ModuleNotFoundError: No module named 'dm_env'`
* [Wheels / test-wheel-windows (3.8)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940446452) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940446452))
* [Lint / c-source / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940268266) ([gh](https://github.com/pytorch/rl/actions/runs/9416665743/job/25940268266)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Lint / python-source-and-configs / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940267980) ([gh](https://github.com/pytorch/rl/actions/runs/9416665743/job/25940267980)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Wheels / test-wheel-windows (3.10)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940446922) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940446922)) (matched **win** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `##[error]The operation was canceled.`
* [Wheels / test-wheel-windows (3.9)](https://hud.pytorch.org/pr/pytorch/rl/2202#25940446686) ([gh](https://github.com/pytorch/rl/actions/runs/9416665718/job/25940446686)) (matched **win** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `##[error]The operation was canceled.`
👉 Rebase onto the `viable/strict` branch to avoid these failures
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2202#25940359083) ([gh](https://github.com/pytorch/rl/actions/runs/9416665739/job/25940359083)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/726e95955009c73dc0242424182222e59a9056d7#25936887556)) `test/test_transforms.py::TestVecNorm::test_state_dict_vecnorm`
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@wertyuilife2
Should we reduce (i.e., average) the priorities within each trajectory while we're at it? I don't think it would require much compute, and it would make sure that all items are equally weighted within a trajectory.
Take the following 2 trajectories with associated priorities:

Item: `[0, 1, 2, 3, 4, 5, 6, 7]` Traj: `[0, 0, 0, 1, 1, 1, 1, 1]` Priority: `[10, 1, 1, 10, 1, 2, 1, 1]`
Currently, items 0 and 3, having a higher priority, have more chances of being sampled as start points, and hence you will get more slices starting with them. If we reduce (average), we get (10 + 1 + 1)/3 = 4 for the first trajectory and (10 + 1 + 2 + 1 + 1)/5 = 3 for the second:

Priority: `[4, 4, 4, 3, 3, 3, 3, 3]`
At this point, the start point is equally likely within a trajectory, but some trajectories have a higher probability of being sampled (which seems to make more sense to me?)
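The averaging described above can be sketched as follows (a standalone illustration; `reduce_priorities` is a hypothetical helper, not the torchrl API):

```python
import numpy as np

def reduce_priorities(traj_ids, priorities):
    """Replace each item's priority with the mean priority of its trajectory,
    so every item in a trajectory is equally likely as a slice start point."""
    traj_ids = np.asarray(traj_ids)
    priorities = np.asarray(priorities, dtype=float)
    out = np.empty_like(priorities)
    for tid in np.unique(traj_ids):
        mask = traj_ids == tid
        out[mask] = priorities[mask].mean()
    return out

traj = [0, 0, 0, 1, 1, 1, 1, 1]
prio = [10, 1, 1, 10, 1, 2, 1, 1]
print(reduce_priorities(traj, prio))  # [4. 4. 4. 3. 3. 3. 3. 3.]
```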
I guess any solution will make someone unhappy...
@vmoens I believe that when discussing `PrioritizedSampler`, there is only one correct approach: we should not reduce the priorities of each trajectory while we are at it.
The core idea of PER is that certain samples (not trajectories) are important (such as a critical action) and need to be learned frequently. Reducing the priorities of each trajectory would make it difficult for PER to focus on updating specific important samples.
When discussing `PrioritizedSliceSampler`, we face the choice of whether to reduce the priorities of each slice while we are at it. My suggestion is to leave this choice to the user, since both the calculation of priorities and the calling of `update_priority()` are handled by the user. In other words, we still should not reduce the priorities of each slice.
I think your thoughts are more likely associated with an "episodic buffer", but in my view, the current implementation of ReplayBuffer is not episodic, so there is no need to unify the priority of the entire trajectory.
> The core idea of PER is that certain samples (not trajectories) are important (such as a critical action) and need to be learned frequently. Reducing the priorities of each trajectory would make it difficult for PER to focus on updating specific important samples.
Got it, thanks for that; indeed, that's how I edited the docstring (users should be in charge of setting the proper priority). But when we say "prioritized slice sampler", I can imagine someone thinking: I have a transition with high priority, therefore there is a chance of finding it anywhere in my sample (not just at the beginning) -- whereas now there is a higher chance of finding it at the beginning of a slice than at the end.
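This asymmetry can be seen in a toy simulation (assumed start-point semantics for illustration, not the actual `PrioritizedSliceSampler` internals): slice starts are drawn proportionally to per-item priority, then `slice_len` items are read forward, so a high-priority transition lands at offset 0 of a slice far more often than at any later offset.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, slice_len, hot = 100, 4, 50
prio = np.ones(n_items)
prio[hot] = 100.0                  # one "important" transition

valid = n_items - slice_len + 1    # number of valid start indices
p = prio[:valid] / prio[:valid].sum()
starts = rng.choice(valid, size=10_000, p=p)

# offset k means the hot item is the (k+1)-th element of the slice;
# starting at hot - k places it at offset k
pos_counts = [int(np.sum(starts == hot - k)) for k in range(slice_len)]
```

Here `pos_counts[0]` (hot item at the start of the slice) dwarfs the counts at every later offset, since only the start index itself is priority-weighted.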
I wrote dedicated tests under `test_slice_sampler_prioritized`.
TODO: