pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl
MIT License

[BugFix] Reinitialize vmap callers after reset of vmap randomness #2314

Closed: vmoens closed this 2 months ago

vmoens commented 2 months ago

Description

Describe your changes in detail.

Motivation and Context

Why is this change required? What problem does it solve? If it fixes an open issue, please link to the issue here. You can use the syntax close #15213 if this solves the issue #15213

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

pytorch-bot[bot] commented 2 months ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2314

Note: Links to docs will display an error until the docs builds have been completed.

:x: 11 New Failures, 1 Cancelled Job, 1 Pending, 5 Unrelated Failures

As of commit c7947fad602edda69318f653fe2ed61b241a0f8e with merge base f840a1a4364bbb0bd33fbff7c4554e75af3ee1db:

NEW FAILURES - The following jobs have failed:

* [Continuous Benchmark (PR) / CPU Pytest benchmark](https://hud.pytorch.org/pr/pytorch/rl/2314#27843453764) ([gh](https://github.com/pytorch/rl/actions/runs/10072117562/job/27843453764)) `Process completed with exit code 1.`
* [Continuous Benchmark (PR) / GPU Pytest benchmark](https://hud.pytorch.org/pr/pytorch/rl/2314#27843454398) ([gh](https://github.com/pytorch/rl/actions/runs/10072117562/job/27843454398)) `Process completed with exit code 1.`
* [Generate documentation / build-docs (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843456008) ([gh](https://github.com/pytorch/rl/actions/runs/10072117563/job/27843456008)) `No files were found with the provided path: /home/ec2-user/actions-runner/_work/_temp/artifacts/. No artifacts will be uploaded.`
* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843455687) ([gh](https://github.com/pytorch/rl/actions/runs/10072117574/job/27843455687)) `RuntimeError: Command docker exec -t 859512d1b71b2a3539e5a55bf3c9f3056eb994c2b9341f6cd1a8eb065452fcf9 /exec failed with exit code 139`
* [Lint / python-source-and-configs / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843455496) ([gh](https://github.com/pytorch/rl/actions/runs/10072117575/job/27843455496)) `RuntimeError: Command docker exec -t e5fd67037564cced340002ce5c85ee0b23bd8c2e325dcf865f8c24bdd3b4e317 /exec failed with exit code 1`
* [Unit-tests on Linux / tests-cpu (3.10) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843456119) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843456119)) `test/test_cost.py::TestCQL::test_cql_reduction[sum]`
* [Unit-tests on Linux / tests-cpu (3.11) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843456736) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843456736)) `test/test_cost.py::TestCQL::test_cql_reduction[sum]`
* [Unit-tests on Linux / tests-cpu (3.8) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843457298) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843457298)) `test/test_cost.py::TestCQL::test_cql_reduction[sum]`
* [Unit-tests on Linux / tests-cpu (3.9) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843458617) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843458617)) `test/test_cost.py::TestCQL::test_cql_reduction[sum]`
* [Unit-tests on Linux / tests-gpu (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843457961) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843457961)) `test/test_cost.py::TestCQL::test_cql_reduction[sum]`
* [Unit-tests on Windows / unittests-cpu / windows-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843454379) ([gh](https://github.com/pytorch/rl/actions/runs/10072117581/job/27843454379)) `The process 'C:\Program Files\Git\cmd\git.exe' failed with exit code 128`

CANCELLED JOB - The following job was cancelled. Please retry:

* [Examples Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843455601) ([gh](https://github.com/pytorch/rl/actions/runs/10072117566/job/27843455601)) `##[error]The operation was canceled.`

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843462231) ([gh](https://github.com/pytorch/rl/actions/runs/10072117570/job/27843462231)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Libs Tests on Linux / unittests-sklearn (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843462507) ([gh](https://github.com/pytorch/rl/actions/runs/10072117570/job/27843462507)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843459024) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843459024)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843459449) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843459449)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`
* [Unit-tests on Linux / tests-stable-gpu (3.10, 11.8) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2314#27843459931) ([gh](https://github.com/pytorch/rl/actions/runs/10072117567/job/27843459931)) (matched **linux** rule in [flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json)) `The process '/usr/bin/git' failed with exit code 128`

This comment was automatically generated by Dr. CI and updates every 15 minutes.