Chrispresso / SuperMarioBros-AI


If you want to see a YouTube video describing this at a high level, and showcasing what was learned, take a look here.
If you want to see my blog explaining how all of this works in great detail, go here.

Update: The AI has successfully completed 1-1, 2-1, 3-1, 4-1, 5-1, 6-1, and 7-1. It was also able to learn: flagpole glitch with an enemy, walljump, and a fast acceleration.

This contains information on the following:

  • Installation Instructions
  • Command Line Options
  • Creating a New Population
  • Understanding the Config File
  • Viewing Statistics
  • Results

Installation Instructions

You will need Python 3.6 or newer.

  1. cd /path/to/SuperMarioBros-AI
  2. Run pip install -r requirements.txt
  3. Install the ROM:
    • Unzip Super Mario Bros. (World).zip to some location
    • Run python -m retro.import "/path/to/unzipped/super mario bros. (world)"
      • Make sure you run this on the folder, i.e. python -m retro.import "c:\Users\chris\Downloads\Super Mario Bros. (World)"
      • You should see output text:
        Importing SuperMarioBros-Nes
        Imported 1 games

Command Line Options

If you want to see all command line options, the easiest way is to run python smb_ai.py -h.

Config

Loading Individuals

You may want to load individuals. Perhaps your computer crashed and you want to recover part of an old population, or you want to run experiments that combine individuals from different populations. Whatever the case, you can do so by specifying both of the arguments below:

Note that loading individuals only supports loading the best-performing individual from each generation.

Replaying Individuals

This is helpful if you want to watch particular individuals replay their run. You must specify both of the below arguments:

Disable Displaying

Certain things in PyQt are unfortunately limited by the refresh rate of your monitor. Because of this, while the display is open (even if it's hidden) the game can only run at your monitor's refresh rate. The emulator itself supports faster updates, so an option exists to run everything through the command line only. This can significantly speed up training.

Debug

If you wish to know when populations are improving or when individuals have won, you can set a debug flag. This is helpful if you have disabled the display but still want to know how your population is doing.

Running Examples

I have several folders of examples. If you want to run any of them to see how they perform, do:

Creating a New Population

If you want to create a new population, it's pretty easy. Make sure that if you are using the default settings.config that you change save_best_individual_from_generation and save_population_stats to reflect where you want that information saved. Once you have the config file how you want, simply run python smb_ai.py -c settings.config with any additional command line options.
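As a rough illustration, a settings.config edit for a new population might look like the sketch below. The section name [Statistics] and the two key names come from this README; the paths are placeholders, and any other keys or values in a real config file will differ.

```ini
; Hypothetical sketch, not a complete settings.config.
; Point these at wherever you want results saved before training.
[Statistics]
save_best_individual_from_generation = /path/to/save/best_individuals
save_population_stats = /path/to/save/stats.csv
```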

Understanding the Config File

The config file is what controls the initialization, graphics, save locations, genetic algorithm parameters and more. It's important to know what the options are and what you can change them to.

Neural Network

Specified by [NeuralNetwork]

Graphics

Specified by [Graphics].

Statistics

Specified by [Statistics].

Genetic Algorithm

Specified by [GeneticAlgorithm].

Mutation

Specified by [Mutation].

Crossover

Specified by [Crossover].

Selection

Specified by [Selection].

Misc

Specified by [Misc].

Viewing Statistics

The .csv file contains the mean, median, std, min, and max for frames, distance, fitness, and wins. If you want to view the max distance from a .csv you could do:

from mario import load_stats

stats = load_stats('/path/to/stats.csv')
stats['distance']['max']
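The nested indexing above can be illustrated with a toy stats object. The shape shown here (tracker, then statistic, then one value per generation) is an assumption inferred from the indexing in this README, not the actual output of load_stats, and the numbers are made up:

```python
# Toy stand-in for the dict returned by load_stats. The shape
# (tracker -> statistic -> one value per generation) is inferred from
# the indexing in this README; the real load_stats output may differ.
toy_stats = {
    'distance': {
        'max':  [512, 890, 1421, 3161],   # farthest distance each generation
        'mean': [301, 455, 798, 1502],
    },
    'wins': {
        'max': [0, 0, 0, 1],              # 1 once some individual wins the level
    },
}

# Same indexing pattern as stats['distance']['max'] above:
best_distance_per_gen = toy_stats['distance']['max']
print(best_distance_per_gen[-1])  # best distance in the latest generation -> 3161
```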

Here is an example of how to plot stats using matplotlib.pyplot:

from mario import load_stats
import matplotlib.pyplot as plt

stats = load_stats('/path/to/stats.csv')
tracker = 'distance'
stat_type = 'max'
values = stats[tracker][stat_type]

plt.plot(range(len(values)), values)
ylabel = f'{stat_type.capitalize()} {tracker.capitalize()}' 
plt.title(f'{ylabel} vs. Generation')
plt.ylabel(ylabel)
plt.xlabel('Generation')
plt.show()
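Since the .csv also tracks wins, the same stats dict can answer questions like "in which generation did the AI first win?". A minimal sketch, using toy data in place of stats['wins']['max'] (the list shape is an assumption, as is the helper below):

```python
# Toy per-generation data standing in for stats['wins']['max'];
# in practice you would use load_stats('/path/to/stats.csv').
wins_max = [0, 0, 0, 1, 1, 1]

# First generation index whose best individual won, or None if no winner yet.
first_win = next(
    (gen for gen, won in enumerate(wins_max) if won > 0),
    None,
)
print(first_win)  # -> 3
```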

Results

Different populations of Mario learned in different ways and in different environments. Here are some of the things the AI was able to learn:

Mario beating 1-1 (gif)

Mario beating 4-1 (gif)

Mario learning to walljump (gif)