An SSBM player based on Deep Reinforcement Learning.
NOTE: This project is no longer active and is subject to bit-rot. There is a successor project based on imitation learning from slippi replays at https://github.com/vladfi1/slippi-ai.
Tested on: Ubuntu >=14.04, OSX, Windows 7/8/10.
pip install -e .
Some trained agents are included in the agents directory. The full set of trained agents is available here.
You will need to know where dolphin is located. On Mac the dolphin path will be /Applications/Dolphin.app/Contents/MacOS/Dolphin. If dolphin-emu is already on your PATH then you can omit the --exe argument.
python3 phillip/run.py --gui --human --start 0 --reload 0 --epsilon 0 --load agents/FalconFalconBF --iso /path/to/SSBM.iso --exe /path/to/dolphin [--windows]
Trained agents are stored in the agents directory. Aside from FalconFalconBF, the agents in agents/delay0/ are also fairly strong. Run with --help to see all options. The best human-like agent is delay18/FalcoBF, available in the Google Drive zip.
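As a rough sketch, the play command above could be assembled programmatically as follows. `build_play_command` is a hypothetical helper, not part of phillip; it uses only the flags shown above and assumes it is run from the phillip root. The fallback to the Mac path when dolphin-emu is not on PATH is an illustrative choice.

```python
import platform
import shutil
import sys

# Dolphin location on Mac, as given in the text above.
MAC_DOLPHIN = "/Applications/Dolphin.app/Contents/MacOS/Dolphin"


def build_play_command(iso, agent="agents/FalconFalconBF", exe=None, windows=False):
    """Return the argument list for a human-vs-agent game (hypothetical helper)."""
    cmd = [
        sys.executable, "phillip/run.py",  # current interpreter instead of a literal "python3"
        "--gui", "--human",
        "--start", "0", "--reload", "0", "--epsilon", "0",
        "--load", agent,
        "--iso", iso,
    ]
    # Assumption: --exe can be left out when dolphin-emu is already on PATH.
    if exe is None and shutil.which("dolphin-emu") is None:
        if platform.system() == "Darwin":
            exe = MAC_DOLPHIN
        # On other platforms exe stays None and must be supplied by the caller.
    if exe is not None:
        cmd += ["--exe", exe]
    if windows:
        cmd.append("--windows")
    return cmd
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would launch the game. Using `sys.executable` avoids depending on whether the interpreter is installed as `python3` or `python`.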
On Windows, --exe will be the path to the Binary\x64\Dolphin.exe you unzipped. In general, forward slashes (/) should be backslashes (\) in all paths, unless you are using MinGW, Cygwin, git bash, or some other unix shell emulator.
You may need to omit the 3 from commands like python3 and pip3.
Communication with dolphin over TCP is enabled by the --tcp 1 flag (now implied by --windows). You may also need to open port 5555 in your firewall.
You may need to pass an explicit user directory with --user tmp (the temp directories that python creates start with /tmp/... and aren't valid for Windows dolphin).
Training is controlled by phillip/train.py. See also runner.py and launcher.py for training massively in parallel on slurm clusters. Phillip has been trained at the MGHPCC. It is recommended to train with a custom dolphin which uses zmq to synchronize with the AI; the commands below will likely fail otherwise.
Local training is also possible. First, edit runner.py with your desired training params (advanced). Then do:
python3 runner.py # will output a path
python3 launcher.py saves/path/ --init --local [--agents number_of_agents] [--log_agents]
To view stats during training:
tensorboard --logdir logs/
The trainer and (optionally) agents redirect their stdout/err to slurm_logs/. To end training:
kill $(cat saves/path/pids)
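For anyone scripting this step, a minimal Python equivalent of the kill command might look like the sketch below. It assumes the pids file holds whitespace-separated process IDs, which is what `kill $(cat ...)` requires; `read_pids` and `stop_training` are hypothetical names, not part of phillip.

```python
import os
import signal
from pathlib import Path


def read_pids(pids_file):
    """Parse whitespace-separated process IDs from the given file."""
    return [int(token) for token in Path(pids_file).read_text().split()]


def stop_training(pids_file):
    """Send SIGTERM to every listed process; return the PIDs that were signalled."""
    signalled = []
    for pid in read_pids(pids_file):
        try:
            os.kill(pid, signal.SIGTERM)  # SIGTERM is also the default signal of `kill`
            signalled.append(pid)
        except ProcessLookupError:
            pass  # the process has already exited
    return signalled
```

Calling `stop_training("saves/path/pids")` would then mirror the shell one-liner, while tolerating processes that have already exited.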
To resume training, run launcher.py again but omit --init (it would overwrite your old network).
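The difference between starting and resuming a local run can be sketched as a small command builder. `launcher_command` is a hypothetical helper, not part of phillip; it uses only the launcher.py flags shown above and defaults to leaving out --init, so overwriting an existing network has to be requested explicitly.

```python
import sys


def launcher_command(save_dir, init=False, agents=None, log_agents=False):
    """Build the launcher.py argument list for local training (hypothetical helper)."""
    cmd = [sys.executable, "launcher.py", save_dir]
    if init:
        cmd.append("--init")  # start fresh; overwrites an existing network
    cmd.append("--local")
    if agents is not None:
        cmd += ["--agents", str(agents)]
    if log_agents:
        cmd.append("--log_agents")
    return cmd


first_run = launcher_command("saves/path/", init=True, agents=2)
resumed = launcher_command("saves/path/", agents=2)  # same command without --init
```

Here `saves/path/` stands in for the path printed by runner.py, as in the commands above.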
Training on Windows is not supported.
Thanks to microsoftv there is now an instructional video as well!
Come to the Discord!
I've been streaming practice play over at http://twitch.tv/x_pilot. There are also some recordings on my YouTube channel.
Big thanks to altf4 for getting me started, and to spxtr for a python memory watcher. Some code for dolphin interaction has been borrowed from both projects (mostly the latter now that I've switched to pure python).