baiyanquan / HolisticRCA

Artifacts accompanying HolisticRCA, a framework for failure root cause analysis in cloud-native systems from a holistic perspective.

HolisticRCA

Requirements

Dependencies

cd ./code
pip install -r requirements.txt
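
After installing, it can be useful to confirm that every listed package is actually present in the active environment. The following is a minimal sketch, not part of the repository: the helper names `parse_requirement_names` and `missing_packages` are hypothetical, and the parser assumes simple `name==version` style lines, skipping comments and pip options.

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract bare package names from requirements.txt content."""
    names = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # Cut at the first version specifier, marker, extra, or space.
        names.append(re.split(r"[=<>!~;\[\s]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names with no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_packages(parse_requirement_names(open("requirements.txt").read()))` run from the code folder would list anything that still needs installing. Version constraints are not checked.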

Our Test Sandbox

Folder Structure

Note: the model archive was split into parts for uploading. Run the following commands in the working folder to merge the parts and extract the archive:

zip model_split.zip -s=0 --out model.zip
unzip model.zip

Next, download the temp data from https://github.com/baiyanquan/HolisticRCATempData.

Finally, place the model and temp_data folders in the working folder so that it matches the following structure:

.
├── README.md
├── code                                          
│   ├── data_filter                             preprocess data
│   │   ├── CCF_AIOps_challenge_2022            preprocess dataset A
│   │   ├── ICASSP_AIOps_challenge_2022         preprocess dataset B
│   │   └── Eadro_TT_and_SN                     preprocess dataset C
│   ├── HolisticRCA                             the main model of the work
│   │   ├── ablation                            models for ablation study
│   │   ├── base                                base classes for model construction
│   │   ├── config                              configuration of file paths
│   │   ├── data_loader                         load dataset
│   │   ├── dataset                             base class for dataset reader
│   │   ├── explain                             mask learning component (for resource entity localization and fault-related observability data localization)
│   │   ├── model                               main components except mask learning
│   │   ├── trainer                             perform model training
│   │   └── util                                RERG class and data transformation
│   ├── shared_util                             some basic util functions
│   ├── experiments_a.sh                        quick experiments for dataset A
│   ├── experiments_b.sh                        quick experiments for dataset B
│   ├── experiments_c.sh                        quick experiments for dataset C
│   └── requirements.txt
├── model                                       saved model data for reproduction
└── temp_data                                   saved temp data for reproduction
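
Before running the experiments, the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the function name `missing_entries` is hypothetical, and the list of checked paths is an assumption drawn from the tree shown above.

```python
from pathlib import Path

# Paths taken from the folder structure above; adjust if your layout differs.
EXPECTED = [
    "code/data_filter",
    "code/HolisticRCA",
    "code/shared_util",
    "code/requirements.txt",
    "model",
    "temp_data",
]


def missing_entries(root="."):
    """Return the expected paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]
```

Calling `missing_entries(".")` from the working folder returns an empty list when the model and temp_data folders are in place, and otherwise names what is missing.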

Quick Start / Reproducibility

Prerequisites

  1. Install the Python packages listed in requirements.txt.
  2. Unzip model.zip and temp_data.zip.

Simple Result Checking

The saved model files are provided in model.zip. To check the results without retraining, comment out rca_data_trainer.train() in the training files invoked by experiments_a.sh, experiments_b.sh, or experiments_c.sh, then run the corresponding script. It will output the evaluation results (note that some file paths need to be changed).
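
Commenting out the training call is easy to do by hand, but the edit can also be sketched as a small text transformation. The helper below is an illustration only: the name `comment_out_train` is hypothetical, and it assumes the call sits alone on its own line, which may not hold for every training file in the repository.

```python
import re

# Matches a line consisting only of a call to rca_data_trainer.train(...),
# capturing the leading indentation so it can be preserved.
TRAIN_CALL = re.compile(r"^(\s*)(rca_data_trainer\.train\(.*\))\s*$")


def comment_out_train(source):
    """Return source with standalone rca_data_trainer.train() calls commented out."""
    out = []
    for line in source.splitlines():
        match = TRAIN_CALL.match(line)
        if match:
            out.append(f"{match.group(1)}# {match.group(2)}")
        else:
            out.append(line)
    result = "\n".join(out)
    # Keep a trailing newline if the input had one.
    return result + "\n" if source.endswith("\n") else result
```

Lines that are already commented out are left untouched. If the call is the only statement inside a block, the block would also need a `pass` statement, so review the result before running it.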

Running

cd ./code
bash experiments_a.sh
bash experiments_b.sh
bash experiments_c.sh
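
The three scripts can also be driven from Python, for instance to stop at the first failing script. This is a minimal sketch under stated assumptions: `build_commands` and `run_all` are hypothetical helpers, not part of the repository, and the scripts are assumed to return a non-zero exit code on failure.

```python
import subprocess

SCRIPTS = ["experiments_a.sh", "experiments_b.sh", "experiments_c.sh"]


def build_commands(scripts=SCRIPTS):
    """Build one bash invocation per experiment script."""
    return [["bash", name] for name in scripts]


def run_all(code_dir="./code", runner=subprocess.run):
    """Run each script in code_dir; return the first non-zero exit code, else 0."""
    for cmd in build_commands():
        result = runner(cmd, cwd=code_dir)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_all()` from the working folder mirrors the commands above. The `runner` parameter exists only so the loop can be exercised without launching the experiments.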

Raw Data

The raw data is too large to include in this repository, so we list the download links here: