AlignmentResearch / go_attack

MIT License
81 stars 7 forks source link

Go Attack

This repository contains code for studying the adversarial robustness of KataGo.

Read about our research here: https://arxiv.org/abs/2211.00241.

View our website here: https://goattack.far.ai/.

To run our adversary with Sabaki, see this guide.

Development / testing information

To clone this repository, run one of the following commands

# Via HTTPS
git clone --recurse-submodules https://github.com/AlignmentResearch/go_attack.git

# Via SSH
git clone --recurse-submodules git@github.com:AlignmentResearch/go_attack.git

You can run pip install -e .[dev] inside the project root directory to install all necessary dependencies.

To run a pre-commit script before each commit, run pre-commit install (pre-commit should already have been installed in the previous step). You may also want to run pre-commit install from engines/KataGo-custom to install that repository's respective commit hook.

Git submodules

Modifications to KataGo are not tracked in this repository and should instead be made to the AlignmentResearch/KataGo-custom repository. We use code from KataGo-custom in this repository via a Git submodule.

Individual containers

We run KataGo within Docker containers. More specifically:

  1. The C++ portion of KataGo runs in the container defined by compose/cpp/Dockerfile.
  2. The Python training portion of KataGo runs in the container defined at compose/python/Dockerfile.

The Dockerfiles contain instructions for how to build them.

After building a container, you run it with a command like

docker run --gpus all -v ~/go_attack:/go_attack -v DATA_DIR:/shared -it humancompatibleai/goattack:cpp

where DATA_DIR is a directory, shared among all containers, in which to save the results of training runs.

A KataGo executable can be found in the /engines/KataGo-custom/cpp directory inside the C++ container.

Launching victim-play training runs

In order to launch training runs, run several containers simultaneously:

See configs/examples for example experiment configurations and example values for the warmstart flags.

For these wrapper scripts in kubernetes/, optional flags for the wrapper come before any positional arguments, but optional flags for the underlying command the wrapper calls go after any positional arguments. For example, in the command /go_attack/kubernetes/shuffle-and-export.sh --preseed WARMSTART-SELFPLAY-DIR EXPERIMENT-NAME /shared -add-to-window 100000000, --preseed is a flag for the wrapper whereas -add-to-window is a flag to be passed to /engines/KataGo-tensorflow/python/selfplay/shuffle_and_export_loop.sh.

Docker compose

Within the compose directory of this repo are a few docker-compose .yml files that automate the process of spinning up the various components of training.

Each .yml file also has a corresponding .env that configures more specific parameters of the run ( e.g. what directory to write to, how many threads to use, batch size, where to look for other config files ).

(Note: we stopped using these in October 2022, so they are no longer maintained.)

Website and analysis notebooks

See AlignmentResearch/KataGoVisualizer.

Baseline attacks

In addition to the learned attacks, we also implement 5 baseline, hardcoded attacks:

You can test these attacks by running baseline_attacks.py with the appropriate --strategy flag (edge, random, pass, spiral, or mirror). Run python scripts/baseline_attacks.py --help for more information about all the available flags.