[!WARNING]
As of Feb 9, 2024, the pyRDDLGym API has been updated to version 2.0, and is no longer backwards compatible with the previous stable version 1.4.4. While we strongly recommend that you update to 2.0, if you require the old API, you can install the last stable version with pip:
pip install pyRDDLGym==1.4.4
or directly from GitHub:
pip install git+https://github.com/pyrddlgym-project/pyRDDLGym@version_1.4.4_stable
A Python toolkit for auto-generation of OpenAI Gym environments from Relational Dynamic Influence Diagram Language (RDDL) description files.
This is currently the official parser, simulator and evaluation system for RDDL in Python, with new features and enhancements to the RDDL language.
We require Python 3.8+ and the following packages: ply, pillow>=9.2.0, numpy>=1.22, matplotlib>=3.5.0, gymnasium, pygame, termcolor.
You can install our package, along with all of its prerequisites, using pip:
pip install pyRDDLGym
Since pyRDDLGym does not come with any premade environments, you can either load RDDL documents from your local file system, or install rddlrepository for easy access to preexisting domains:
pip install rddlrepository
Several example scripts are provided to illustrate basic pyRDDLGym usage.
For example, to simulate an environment, run the following from the pyRDDLGym install directory in a shell that supports the python command (this requires rddlrepository):
python -m pyRDDLGym.examples.run_gym "Cartpole_Continuous_gym" "0" 1
which loads instance "0" of the CartPole control problem with continuous actions from rddlrepository and simulates it with a random policy for one episode.
This section outlines some of the basic Python API functions of pyRDDLGym in more detail.
Instantiation of an existing environment by name is as easy as:
import pyRDDLGym
env = pyRDDLGym.make("Cartpole_Continuous_gym", "0")
Loading your own domain files is just as straightforward:
import pyRDDLGym
env = pyRDDLGym.make("/path/to/domain.rddl", "/path/to/instance.rddl")
Both versions above instantiate env as an OpenAI Gym environment, so that the usual reset() and step() calls work as intended.
You can also pass custom settings to the make command, e.g.:
import pyRDDLGym
env = pyRDDLGym.make("Cartpole_Continuous_gym", "0", enforce_action_constraints=True, ...)
You can design your own visualizer by subclassing from pyRDDLGym.core.visualizer.viz.BaseViz and overriding the render(state) method.
Then, changing the visualizer of the environment is easy:
viz_class = ... # the class name of your custom viz
env.set_visualizer(viz_class)
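A subclass along the lines described above might look like the following. This is a minimal sketch rather than an official example: the class name TextViz and its format_state helper are hypothetical, and it assumes that the environment constructs the visualizer with its model as the first argument and that render(state) should return a PIL image. The fallback BaseViz stub exists only so the sketch can be run without pyRDDLGym installed.

```python
try:
    from pyRDDLGym.core.visualizer.viz import BaseViz
except ImportError:
    # fallback stub so this sketch runs without pyRDDLGym installed
    class BaseViz:
        pass


class TextViz(BaseViz):
    """Hypothetical visualizer that draws the state dict as lines of text."""

    def __init__(self, model=None, size=(400, 200), **kwargs):
        # assumption: the environment passes its model to the constructor
        self._model = model
        self._size = size

    def format_state(self, state):
        # one "name = value" line per fluent, sorted for a stable layout
        return [f"{name} = {value}" for name, value in sorted(state.items())]

    def render(self, state):
        # assumption: render returns a PIL image; pillow is imported lazily
        from PIL import Image, ImageDraw
        image = Image.new("RGB", self._size, "white")
        draw = ImageDraw.Draw(image)
        for i, line in enumerate(self.format_state(state)):
            draw.text((10, 10 + 15 * i), line, fill="black")
        return image


viz = TextViz()
print(viz.format_state({"vel": 0.5, "pos": 1.0}))
```

Under these assumptions, the class itself (not an instance) would be passed as viz_class in the set_visualizer call above.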
You can record an animated GIF or movie of the agent interaction with an environment (described below). To do this, simply pass a MovieGenerator object to the set_visualizer method:
from pyRDDLGym.core.visualizer.movie import MovieGenerator
movie_gen = MovieGenerator("/path/where/to/save", "env_name")
env.set_visualizer(viz_class, movie_gen=movie_gen)
Agents map states to actions through the sample_action(obs) function, and can be used to interact with an environment.
For example, to initialize a random agent:
from pyRDDLGym.core.policy import RandomAgent
agent = RandomAgent(action_space=env.action_space, num_actions=env.max_allowed_actions)
All agent instances support one-line evaluation in a given environment:
stats = agent.evaluate(env, episodes=1, verbose=True, render=True)
which returns a dictionary of summary statistics (e.g. "mean", "std"), and also visualizes the domain in real time. Of course, if you wish, the standard OpenAI Gym interaction is still available to you:
total_reward = 0
state, _ = env.reset()
for step in range(env.horizon):
    env.render()
    action = agent.sample_action(state)
    next_state, reward, terminated, truncated, _ = env.step(action)
    print(f'state = {state}, action = {action}, reward = {reward}')
    total_reward += reward
    state = next_state
    done = terminated or truncated
    if done:
        break
print(f'episode ended with reward {total_reward}')

# release all viz resources, and finish logging if used
env.close()
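The loop above extends naturally to several episodes. The sketch below is illustrative only: StubEnv and ConstantAgent are hypothetical stand-ins, used so that the example runs without pyRDDLGym or rddlrepository installed, and they mimic only the reset(), step(), horizon and sample_action() members used above. The fluent names "pos" and "force" are made up. The statistics shown are one plausible way to summarize returns, not necessarily what agent.evaluate computes internally.

```python
import statistics


class StubEnv:
    """Hypothetical stand-in for a pyRDDLGym environment (not the real API)."""

    horizon = 5

    def reset(self):
        self.t = 0
        return {"pos": 0.0}, {}

    def step(self, action):
        self.t += 1
        state = {"pos": float(self.t)}
        reward = 1.0  # constant reward per step, for illustration
        terminated = False
        truncated = self.t >= self.horizon
        return state, reward, terminated, truncated, {}


class ConstantAgent:
    """Hypothetical agent that always returns the same action dict."""

    def sample_action(self, state):
        return {"force": 0.0}


def rollout(env, agent):
    """Run one episode and return its total reward."""
    total_reward = 0.0
    state, _ = env.reset()
    for _ in range(env.horizon):
        action = agent.sample_action(state)
        state, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward


# collect the return of three independent episodes and summarize them
returns = [rollout(StubEnv(), ConstantAgent()) for _ in range(3)]
summary = {
    "mean": statistics.mean(returns),
    "std": statistics.pstdev(returns),
    "min": min(returns),
    "max": max(returns),
}
print(summary)
```

With a real environment, you would pass the env and agent objects from the earlier examples to rollout in place of the stubs.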
[!NOTE]
All observations (for a POMDP), states (for an MDP) and actions are represented by dict objects, whose keys correspond to the appropriate fluents as defined in the RDDL description. Here, the syntax is pvar-name___o1__o2..., where pvar-name is the pvariable name, followed by 3 underscores, and object parameters o1, o2, ... are separated by 2 underscores.
[!WARNING]
There are two known issues not documented with RDDL:
- the minus (-) arithmetic operation must have spaces on both sides, otherwise it is ambiguous whether it refers to a mathematical operation or to part of a variable name
- aggregation-union-precedence parsing requires encapsulating parentheses around aggregations, e.g., (sum_{}[]).
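The dict key convention described in the note above (pvar-name___o1__o2...) can be illustrated with a small helper. This is a minimal sketch and not part of the pyRDDLGym API: the function names make_fluent_key and split_fluent_key and the fluent "temp" are hypothetical, and the sketch assumes that pvariable and object names do not themselves start with, end with, or contain double underscores, and that a fluent without parameters is keyed by its name alone.

```python
def make_fluent_key(pvar_name, *objects):
    """Build a dict key of the form pvar-name___o1__o2..."""
    if not objects:
        # assumption: non-parameterized fluents use the bare name
        return pvar_name
    return pvar_name + "___" + "__".join(objects)


def split_fluent_key(key):
    """Split a key back into (pvar_name, [o1, o2, ...])."""
    # the first run of 3 underscores separates the name from the parameters
    pvar_name, sep, params = key.partition("___")
    objects = params.split("__") if sep else []
    return pvar_name, objects


# hypothetical fluent "temp" parameterized by two objects
key = make_fluent_key("temp", "z1", "h2")
print(key)                    # temp___z1__h2
print(split_fluent_key(key))  # ('temp', ['z1', 'h2'])
```

A helper like split_fluent_key can be handy when grouping entries of a state dict by pvariable, for instance when plotting all grounded values of one fluent together.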
A complete archive of past and present RDDL problems, including all IPPC problems, is also available to clone or install via pip (pip install rddlrepository).
Software for related simulators:
The parser used in this project is based on the parser from Thiago P. Bueno's pyrddl (used in rddlgym).
Please see our paper describing pyRDDLGym. If you found this useful, please consider citing us:
@article{taitler2022pyrddlgym,
    title={pyRDDLGym: From RDDL to Gym Environments},
    author={Taitler, Ayal and Gimelfarb, Michael and Gopalakrishnan, Sriram and Mladenov, Martin and Liu, Xiaotian and Sanner, Scott},
    journal={arXiv preprint arXiv:2211.05939},
    year={2022}
}
This software is distributed under the MIT License.