Closed kurtamohler closed 1 month ago
As of commit 1c90d04e66be04668beeaa2e7415b3643fea0fb9 with merge base e82a69f5af94cc936c4b872fd2ed499ed33b4f8e:
* [Habitat Tests on Linux / tests (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2345#29020584169) ([gh](https://github.com/pytorch/rl/actions/runs/10478069966/job/29020584169)) `RuntimeError: Command docker exec -t 0ebcf6625dffb13eaf98fed4e3b81ab80ddd590a16eefa21b00697c196510f58 /exec failed with exit code 139`
* [Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2345#29573828612) ([gh](https://github.com/pytorch/rl/actions/runs/10478069968/job/29573828612)) `RuntimeError: Command docker exec -t fb62162595af9a039c6fae565f9fb1c8e44e791df21e76ed1e7852856acd0fcc /exec failed with exit code 1`
* [Libs Tests on Linux / unittests-robohive (3.9, 12.1) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2345#29573829721) ([gh](https://github.com/pytorch/rl/actions/runs/10478069968/job/29573829721)) `test/test_libs.py::TestRoboHive::test_robohive[franka_slide_random-v3-True-True]`
* [Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda11_8](https://hud.pytorch.org/pr/pytorch/rl/2345#29034910058) ([gh](https://github.com/pytorch/rl/actions/runs/10478070001/job/29034910058)) (detected as infra flaky with no log or failing log classifier)
* [Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda12_1](https://hud.pytorch.org/pr/pytorch/rl/2345#29034910206) ([gh](https://github.com/pytorch/rl/actions/runs/10478070001/job/29034910206)) (detected as infra flaky with no log or failing log classifier)
* [Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda12_4](https://hud.pytorch.org/pr/pytorch/rl/2345#29034910348) ([gh](https://github.com/pytorch/rl/actions/runs/10478070001/job/29034910348)) (detected as infra flaky with no log or failing log classifier)
* [Continuous Benchmark (PR) / CPU Pytest benchmark](https://hud.pytorch.org/pr/pytorch/rl/2345#29020581867) ([gh](https://github.com/pytorch/rl/actions/runs/10478069953/job/29020581867)) (detected as infra flaky with no log or failing log classifier)
* [Continuous Benchmark (PR) / GPU Pytest benchmark](https://hud.pytorch.org/pr/pytorch/rl/2345#29020582558) ([gh](https://github.com/pytorch/rl/actions/runs/10478069953/job/29020582558)) (detected as infra flaky with no log or failing log classifier)
👉 Rebase onto the `viable/strict` branch to avoid these failures
* [Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cpu](https://hud.pytorch.org/pr/pytorch/rl/2345#29034909895) ([gh](https://github.com/pytorch/rl/actions/runs/10478070001/job/29034909895)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/e82a69f5af94cc936c4b872fd2ed499ed33b4f8e#28752759909))
* [Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job](https://hud.pytorch.org/pr/pytorch/rl/2345#29020588518) ([gh](https://github.com/pytorch/rl/actions/runs/10478069961/job/29020588518)) ([trunk failure](https://hud.pytorch.org/pytorch/rl/commit/e82a69f5af94cc936c4b872fd2ed499ed33b4f8e#28740460021)) `test/test_transforms.py::TestKLRewardTransform::test_kl_lstm`
This comment was automatically generated by Dr. CI and updates every 15 minutes.
A few notes:
`categorical_action_encoding=False` is not supported yet.
Also, some of the games in OpenSpiel do not work properly in `OpenSpielWrapper` because the action spec assumes a discrete space of `pyspiel.Game.num_distinct_actions()` (see OpenSpiel API reference). However, for some of the games, `pyspiel.State.legal_actions()` can return more actions than `pyspiel.Game.num_distinct_actions()`. I suppose to support those games we need to allow the action spec's size to change at each step?
At the moment, this only supports games where all actions are taken by the players, like in chess or tic-tac-toe. But I've realized that OpenSpiel also has a concept of chance nodes, where a random non-player action is taken. For instance, in Kuhn poker, the initial dealing of the cards is a chance node. In liar's dice, the outcome of rolling dice is a chance node. OpenSpiel has some methods to obtain all the possible chance actions and the associated probability distribution (shown in this example).
For now, I'll raise an error if a loaded game contains chance nodes and leave it as a TODO. I might wait to add support for it in a follow-up PR, unless you would prefer for me to add it in this PR.
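For reference, here is a minimal sketch of how chance nodes could be resolved by sampling from the distribution OpenSpiel exposes. It is not the code in this PR. The helper `resolve_chance_nodes` and the `ToyDealState` class are hypothetical names introduced only for illustration; the function is duck-typed against the `pyspiel.State` methods `is_terminal()`, `is_chance_node()`, `chance_outcomes()` and `apply_action()`, so the toy class stands in for a real state and the snippet runs without OpenSpiel installed.

```python
import random


def resolve_chance_nodes(state, rng=random):
    """Apply sampled chance actions until a player (or terminal) node is reached.

    `state` only needs is_terminal(), is_chance_node(), apply_action(action)
    and chance_outcomes() -> list of (action, probability) pairs, which are
    the method names used by pyspiel.State.
    """
    applied = []
    while not state.is_terminal() and state.is_chance_node():
        actions, probs = zip(*state.chance_outcomes())
        # Sample one chance action according to its probability.
        action = rng.choices(actions, weights=probs, k=1)[0]
        state.apply_action(action)
        applied.append(action)
    return applied


class ToyDealState:
    """Hypothetical stand-in for a state: deal 2 cards from a 3-card deck."""

    def __init__(self):
        self.dealt = []

    def is_terminal(self):
        return False

    def is_chance_node(self):
        # The two deals are chance nodes; afterwards the players act.
        return len(self.dealt) < 2

    def chance_outcomes(self):
        remaining = [card for card in range(3) if card not in self.dealt]
        return [(card, 1.0 / len(remaining)) for card in remaining]

    def apply_action(self, action):
        self.dealt.append(action)


state = ToyDealState()
dealt = resolve_chance_nodes(state, random.Random(0))
print(dealt, state.is_chance_node())
```

With OpenSpiel installed, the same helper should accept a state obtained from `pyspiel.load_game(...).new_initial_state()`, assuming the wrapper chooses to sample chance actions itself rather than expose them to the caller.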
> I suppose to support those games we need to allow the action spec's size to change at each step?

Yes I think having a dynamic space would be the way to go. See #2143

> OpenSpiel has some methods to obtain all the possible chance actions and the associated probability distribution (shown in this example).

That's an amazing feature. Happy to integrate it separately!
> > I suppose to support those games we need to allow the action spec's size to change at each step?
>
> Yes I think having a dynamic space would be the way to go. See #2143

Actually, I'm not so sure that we would need dynamic action specs after all. It is true that `pyspiel.State.legal_actions()` can return a different number of actions than `pyspiel.Game.num_distinct_actions()`, but I'm pretty sure (not 100% sure) that only happens when it's a chance node, in which case the length of `legal_actions()` is `pyspiel.Game.max_chance_outcomes()` instead. Once I add support for chance nodes, it will have its own action spec separate from the players' action specs, and I think all action specs will maintain the same shape throughout the game.
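One way to test that hypothesis is to pick the bound by node type and compare it with the number of legal actions. The sketch below is illustrative only: `legal_actions_fit`, `ToyGame` and `ToyState` are hypothetical names, the sizes are made up, and the toy classes only mimic the `pyspiel` method names mentioned above so the snippet runs without OpenSpiel.

```python
from dataclasses import dataclass, field


def legal_actions_fit(game, state):
    """Check the number of legal actions against the bound for this node type.

    Assumption being tested: chance nodes are bounded by
    max_chance_outcomes(), player nodes by num_distinct_actions().
    """
    if state.is_chance_node():
        bound = game.max_chance_outcomes()
    else:
        bound = game.num_distinct_actions()
    return len(state.legal_actions()) <= bound


@dataclass
class ToyGame:
    """Hypothetical stand-in for a game object, with made-up sizes."""

    distinct_actions: int = 2
    chance_outcomes: int = 3

    def num_distinct_actions(self):
        return self.distinct_actions

    def max_chance_outcomes(self):
        return self.chance_outcomes


@dataclass
class ToyState:
    """Hypothetical stand-in for a state at a single node."""

    chance: bool
    legal: list = field(default_factory=list)

    def is_chance_node(self):
        return self.chance

    def legal_actions(self):
        return list(self.legal)


game = ToyGame()
print(legal_actions_fit(game, ToyState(chance=True, legal=[0, 1, 2])))
print(legal_actions_fit(game, ToyState(chance=False, legal=[0, 1, 2])))
```

If the check held at every node of every supported game, fixed-shape specs (one for players, one for chance) would be enough; a single failing player node would point back to dynamic specs.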
I think I've addressed everything that needed to be fixed so far. Let me know if there is anything else.
Description
Adds environment wrapper classes for OpenSpiel.
`OpenSpielWrapper.reset` supports resetting to a specified state.
Motivation and Context
Part of #2133
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an `x` in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!