This is a framework for testing the performance of Game Description Language (GDL) interpreters and reasoners used in General Game Playing. It allows for automatically running tests on a wide variety of reasoners across a wide variety of games, with minimal human intervention. It also supplies tools for analyzing the outputs of these tests.
This was inspired by an article by Yngvi Björnsson and Stephan Schiffel in the GIGA '13 proceedings that tested the performance of various such interpreters. It is designed so that tools from different programming languages can be tested. As a benefit of being open-source, programmers can add hooks for their own creations, or at least check that their code is being tested correctly. Having the results in a standardized format also means that analyses and visualizations can be created once, committed to the code base, and regenerated for future sets of results.
All games available on ggp.org are used in testing. Game IDs include the originating repository, the key within the repository, and a hash of the game contents, so no two versions of a game will be conflated.
There is also some preliminary work on a tool for testing the correctness of reasoners, which the performance-testing framework does not check.
Currently, before using the framework, you must provide a ggp-base.jar file. Check out a recent version of the ggp-base repository, run "./gradlew assemble" in the ggp-base folder, and copy ggp-base/build/libs/ggp-base.jar into gdl-perf/lib. (In the future this can be converted into a different kind of dependency.)
Before running tests, you must create a file named "computerName.txt" in the root of the repository. This should contain a name to indicate your computer; a recommended format is your GitHub username followed by a number identifying the specific computer (for example, "yourusername-1"). This makes it possible to commit results from your hardware without conflicting with other users' results. (This in turn makes it possible to commit raw results of testing without making the engine public, which some GGP competitors would like to avoid.)
Some engines require additional setup or configuration that is specific to your system (e.g. installing libraries or specifying program locations). These are explained in the SETUP.md file.
The following commands can be run via the Gradle wrapper script (replace 'gradlew' with 'gradlew.bat' if running on Windows):
Command | Description
--- | ---
./gradlew eclipse | Generate files needed to import this project in the Eclipse IDE.
./gradlew idea | Generate files needed to import this project in the IntelliJ IDE.
./gradlew runPerfSample | Run a sample selection of perf tests. (Takes under an hour.)
./gradlew runPerfAll | Run perf tests for all untested game/engine pairs. (Takes multiple days, but progress is recorded as it goes.)
./gradlew perfAnalysis | Generate a set of web pages containing an analysis of all perf tests that have been run. This is placed in the "analyses" folder.
Users are encouraged to add their own engines. There are two approaches to adding an engine: one specific to Java-based engines and one that works with any language.
Java engines can be added as an entry in the JavaEngineType enum, which can then be used to construct the corresponding EngineType entry. Once an EngineType entry is available, the engine will be automatically tested by the MissingEntriesPerfTestRunner (a.k.a. runPerfAll).
The first step to accomplishing this is to add your code as a dependency of the gdl-perf project, so you can reference your engine from the gdl-perf code. This can be done in a few ways. From easiest to understand to most complete:
Now that you can reference your engine in the gdl-perf code, you'll need to create three things: a JavaSimulatorWrapper implementation for your engine, a JavaEngineType entry, and an EngineType entry.
Make a Runnables class that contains a method returning a JavaSimulatorWrapper backed by your engine. Select types for the Engine, State, Role, and Move generics that suit your engine. This should expose all the functionality of your engine in a uniform way. See StateMachineRunnables, GameSimulatorRunnables, JavaSimulatorRunnables, and similarly-named classes for examples.
Then, add an entry to the JavaEngineType enum that takes as its arguments a version string identifying the version of the engine and an instance of the wrapper.
Finally, add an EngineType entry with the JavaEngineType as an argument.
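For orientation only, the new entries might look roughly like the fragments below. The enum names, version string, and constructor arguments here are hypothetical; match them to the existing entries in JavaEngineType and EngineType rather than copying these verbatim.

```java
// Inside the JavaEngineType enum (hypothetical entry; check the real constructor arguments):
MY_ENGINE_0_1("0.1", MyEngineRunnables.getWrapper()),

// Inside the EngineType enum (hypothetical entry; check the real constructor arguments):
MY_ENGINE_0_1(JavaEngineType.MY_ENGINE_0_1),
```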
At this point, you can start testing your engine. Try adding it to the list of engine types in the SamplePerfTestRunner as a starting point.
To test an engine in a non-Java language, you'll need to write an executable or script that adheres to the rules described in the "Performance testing" section below. Generally, it's best to put that script in the code repository for the engine being tested, not gdl-perf itself.
Within the gdl-perf repository, you'll need to create an EngineEnvironment and an EngineType entry.
Once the EngineType entry has been added, you can start testing your engine. Try adding it to the list of engine types in the SamplePerfTestRunner as a starting point.
Performance tests are run in their own process, one per game. Interactions with the framework use the command line and files written in a standard format. This has two advantages:
The following command line arguments are given to the test process:
The command line to run is specified in the EngineType enum entry for the engine. Note that additional arguments may be used in the command line; these will precede the other arguments. For example, the implementation for Java-based engines (PerfTestProcess) takes an initial argument specifying the ID of the engine to be tested.
The test process should run the performance test by repeatedly getting the initial state; checking if the state is terminal; if not, getting the available legal moves, picking random moves for each player, and using them to get the next state; repeating this process until a terminal state is reached; and computing the goal values for each player in this state. (The initial state may optionally be generated only once.) The Java implementation of this is in JavaPerfTestRunnable.
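For concreteness, here is a minimal sketch of that loop in Java. The GameSimulator interface, its method names, and the time-limit handling are assumptions made for this example only and are not this repository's API; the real Java implementation is the JavaPerfTestRunnable class mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hedged sketch of the playout loop described above. GameSimulator is a stand-in
// interface defined here for illustration; it is not the gdl-perf API.
public class PlayoutSketch {
    // Assumed minimal simulator interface: one list of legal moves per role, and a
    // joint move containing one chosen move per role.
    interface GameSimulator<S, M> {
        S getInitialState();
        boolean isTerminal(S state);
        List<List<M>> getLegalMovesByRole(S state);
        S getNextState(S state, List<M> jointMove);
        List<Integer> getGoals(S state);
    }

    // Runs random playouts until the (assumed) time limit expires and returns the
    // number of state changes performed, the quantity this sketch measures.
    static <S, M> long runRandomPlayouts(GameSimulator<S, M> simulator, long testLengthMillis) {
        Random random = new Random();
        long stateChanges = 0;
        long endTime = System.currentTimeMillis() + testLengthMillis;
        while (System.currentTimeMillis() < endTime) {
            // The initial state may optionally be generated only once, outside this loop.
            S state = simulator.getInitialState();
            while (!simulator.isTerminal(state)) {
                // Pick one random legal move for each player to form a joint move.
                List<M> jointMove = new ArrayList<>();
                for (List<M> movesForRole : simulator.getLegalMovesByRole(state)) {
                    jointMove.add(movesForRole.get(random.nextInt(movesForRole.size())));
                }
                state = simulator.getNextState(state, jointMove);
                stateChanges++;
            }
            // Compute the goal values for each player in the terminal state.
            simulator.getGoals(state);
        }
        return stateChanges;
    }
}
```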
Results are expected to be written by the test process in the following format:
The variable types used in the results for a successful performance test are as follows:
An unsuccessful performance test may leave these variables to indicate the nature of the error:
Each completed perf test appends a line to a file within the results directory. Each line is a series of entries delimited by semicolons.
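As a small illustration of consuming such a file, the sketch below assumes only the structure stated above: one line per completed test, with entries separated by semicolons. The class name is hypothetical, and the meaning of each entry is not assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch of reading a perf-test results file, splitting each line on semicolons.
public class ResultsLineReader {
    public static void printEntries(Path resultsFile) throws IOException {
        List<String> lines = Files.readAllLines(resultsFile);
        for (String line : lines) {
            String[] entries = line.split(";");
            System.out.println(entries.length + " entries: " + String.join(" | ", entries));
        }
    }
}
```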
The format of results is not yet documented.
This is not yet documented.