
gym-md

Japanese README.md

The original Japanese README can be found here.


Overview

gym-md is a Python reimplementation[^1] of the dungeon exploration game MiniDungeons[^2], created as an OpenAI Gym[^3] environment. MiniDungeons[^2] is a roguelike dungeon exploration game created as a benchmark research domain for modeling the decision-making styles of human players[^4]. A Java implementation of MiniDungeons can be found here.

[^1]: Y. Iwasaki and K. Hasebe, “A framework for generating playstyles of game AI with clustering of play logs,” in Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, INSTICC, SciTePress, 2022, pp. 605–612.

[^2]: C. Holmgård, A. Liapis, J. Togelius, and G. N. Yannakakis, “Evolving personas for player decision modeling,” in 2014 IEEE Conference on Computational Intelligence and Games, 2014, pp. 1–8.

[^3]: G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv preprint arXiv:1606.01540, 2016.

[^4]: A. Liapis, “MiniDungeons,” website, 2022. Accessed on: Mar. 27, 2022. [Online]. Available: http://antoniosliapis.com/projects/project_minidungeons.php

Installation

Installing from PyPI

The gym-md Python package can be found on PyPI. To install the latest gym-md package, run:

pip install gym-md
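
To verify the installation, you can create one of the bundled levels from a Python shell (a quick sketch; md-test-v0 is the test level used throughout this README, and importing gym_md registers the md-* environments with gym):

import gym
import gym_md  # importing gym_md registers the md-* environments with gym

env = gym.make('md-test-v0')
print(env)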

Running build and tests

Prerequisites

The gym-md project makes use of pipenv for the overall project's package management. In order to build the project's documentation and run the respective tests pipenv will need to be installed. Please see the 'Installation' section on the pipenv PyPI page. If you face any issues with the pipenv installation, you can also try installing pipenv using pip (see source).

Furthermore, additional tests and code linting are orchestrated using tox, as defined in the tox.ini file. Please see the tox installation instructions for more detail.

Running the build and tests

If you would like to build and install gym-md from source, please run the following commands:

git clone https://github.com/Ganariya/gym-md.git
cd gym-md

# create the pipenv gym-md build and testing environment
pipenv install

# launch the pipenv environment
pipenv shell

# build gym-md documentation
pipenv run build

# run gym-md tests
pipenv run test

# start the tox testing orchestration
tox

# to build and upload your own gym-md wheel (.whl) file, please see the upload.sh file.
# your custom .whl can be locally installed using: pip install <path to .whl>
rm -f -r gym_md.egg-info/* dist/*
python setup.py bdist_wheel
twine upload dist/*

Usage

import gym
import gym_md
import random

env = gym.make('md-test-v0')

LOOP: int = 100
TRY_OUT: int = 100

for _ in range(TRY_OUT):
    observation = env.reset()
    reward_sum = 0
    for i in range(LOOP):
        env.render(mode='human')
        actions = [random.random() for _ in range(7)]
        observation, reward, done, info = env.step(actions)

        reward_sum += reward

        if done:
            env.render()
            break

    print(reward_sum)

The MiniDungeons Gym Environment

Overview

Click here for a Getting Started with Gym overview.

Actions

An action within the gym-md environment (env) is represented as a Python list containing seven (7) floating point values, for example:

actions_eg = [0.7603953105618472,
              0.954037518265538,
              0.7224447519623062,
              0.35121023208759905,
              0.4878166326111911,
              0.6166020008598004,
              0.48734265188517545]

Each index in the actions list corresponds to a specific action available for the game agent to take.

The environment (env) selects the action within the action list that has the highest value.

In the actions_eg list example, the action with the highest value is 'Head to the treasure' (which is index 1, with a value of 0.954037518265538). However, if the selected highest action is not a valid action within the given state, then the next highest action value is taken.

In the actions_eg list example, if the original highest action 'Head to the treasure' (index 1, with a value of 0.954037518265538) cannot be performed (e.g. there is no more treasure to collect), then the next highest action 'Head to the monster' (index 0, with a value of 0.7603953105618472) is chosen. This action selection process is repeated until a valid action can be performed within the given state. Furthermore, if the highest values are tied, one of the tied actions is selected at random.
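
The selection rule can be sketched as follows (an illustrative sketch only, not the environment's internal implementation; is_valid stands in for the validity check performed by the environment):

import random
from typing import Callable, List

def select_action(actions: List[float], is_valid: Callable[[int], bool]) -> int:
    """Return the index of the highest-valued action that is valid in the current state."""
    # Shuffle first so that ties between equal values are broken at random,
    # then sort by value in descending order (the sort is stable).
    order = list(range(len(actions)))
    random.shuffle(order)
    order.sort(key=lambda i: actions[i], reverse=True)
    for index in order:
        if is_valid(index):
            return index
    raise RuntimeError("no valid action in the current state")

# Using the actions_eg list above: if 'Head to the treasure' (index 1) is invalid,
# 'Head to the monster' (index 0) is selected instead.
print(select_action(actions_eg, is_valid=lambda index: index != 1))  # -> 0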

Environment

env object

The env object created using env = gym.make('md-test-v0') is based on the MdEnvBase class defined within md_env.py. The env object contains several objects and methods; only a subset is discussed here, please see md_env.py for the full definition.

The OpenAI Gym environment specific methods are discussed as part of the env.step method subsection.
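
As with any Gym environment, the action and observation spaces can be inspected directly on the env object (a brief sketch; the exact shapes and bounds of these spaces are defined in md_env.py and depend on the chosen level):

import gym
import gym_md

env = gym.make('md-test-v0')
print(env.action_space)       # space from which the 7-element action list is drawn
print(env.observation_space)  # space of the observations returned by reset() and step()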

env.step method

The gym-md environment's step method returns the following values:

observation, reward, done, info = env.step(actions)
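
These follow the standard Gym conventions[^3]; the gym-md specific contents of each value are defined in md_env.py. Continuing from the step call above (a brief sketch):

print(observation)  # observation of the state reached after taking the action
print(reward)       # numeric reward obtained for this step
print(done)         # True once the episode has finished
print(info)         # dict of auxiliary, environment-specific information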

Levels and Settings

New levels can be created by defining your own class that inherits from the MdEnvBase class. The below example, along with others, can be found within the md_env_list.py script.

from typing import Final

from gym_md.envs.md_env import MdEnvBase

class TestMdEnv(MdEnvBase):
    """TestMdEnv Class."""

    def __init__(self):
        stage_name: Final[str] = "test"
        super(TestMdEnv, self).__init__(stage_name=stage_name)

You will also need to create the respective level's json file and txt file, placed respectively within the props and stages folders. Furthermore, you will then need to add the additional levels to the gym_md/__init__.py and gym_md/envs/__init__.py files.
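
Registration in these __init__.py files follows the standard Gym pattern (a minimal sketch with a hypothetical level named test2; adjust the id and entry_point to your own class and module):

from gym.envs.registration import register

# Hypothetical registration for a new level class Test2MdEnv
register(
    id='md-test2-v0',
    entry_point='gym_md.envs.md_env_list:Test2MdEnv',
)

After registration, the new level can be created in the usual way with gym.make('md-test2-v0').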

For a list of the currently available gym_md environment levels, please see the Stages README.md. For a simpler level list view, please see env_levels.txt. Both the Stages README.md and env_levels.txt files contain the env levels registered within gym_md/__init__.py.