NetManAIOps / CIRCA

Causal Inference-based Root Cause Analysis
BSD 3-Clause "New" or "Revised" License

Questions on Ground Truth Setting in Simulation Dataset #9

Closed: Alexia-I closed this issue 3 weeks ago

Alexia-I commented 1 month ago

Hi Mingjie,

I have a question regarding the ground truth setting in the simulation dataset used in your paper [KDD 22] Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition. It seems that the details on how the ground truth was established are not included in the paper or the documentation.

Additionally, I noticed an issue in the ground truth for simulation graph 0, case 0. Specifically, nodes (37, 39) are not the ancestor nodes of the anomaly nodes (15, 16, 24). If the ground truth nodes and the anomaly nodes are independent, how do you get the ground truth?

Could you provide more information on the methodology used to determine the ground truth? Any additional details or corrections would be greatly appreciated.

Thanks for your assistance!

Best, Alexia

limjcst commented 1 month ago

Thanks for your interest!

The ground truth (i.e., the fault to inject) is generated randomly in circa.experiment.simulation.generate_case. https://github.com/NetManAIOps/CIRCA/blob/0215e1880096aa02a305c697f1c23cac4600ebd2/circa/experiment/simulation/__init__.py#L274

Notice that anomaly detection is unnecessary to generate a simulation dataset. In contrast, we start with a random initial state, a data generation process (the Vector Auto-regression model), and a fault.
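The point above can be sketched in code. The following is a minimal, self-contained illustration of generating normal and abnormal segments from the same linear causal process over a DAG, where the fault is simply an extra shift on one node's noise term. All names and parameters here are made up for illustration; this is not the actual `circa` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, length_normal, length_abnormal = 5, 100, 20

# Weighted adjacency matrix of a DAG (strictly upper-triangular => acyclic)
A = np.triu(rng.normal(scale=0.5, size=(n, n)), k=1)
beta = 0.3  # auto-regressive coefficient carrying effects over time

def generate(length, prev, fault=None):
    """Generate `length` samples; the same process is used before and after
    the fault -- only the intervened node's noise term is shifted."""
    data = []
    for _ in range(length):
        noise = rng.normal(size=n)
        if fault is not None:
            node, strength = fault
            noise[node] += strength  # the injected fault
        # x = A^T x + beta * x_prev + noise, solved exactly since A encodes a DAG
        prev = np.linalg.solve(np.eye(n) - A.T, beta * prev + noise)
        data.append(prev)
    return np.array(data), prev

normal, state = generate(length_normal, prev=rng.normal(size=n))
abnormal, _ = generate(length_abnormal, prev=state, fault=(0, 10.0))
```

Note that anomaly detection plays no role here: the fault is part of the data generation process itself, and the abnormal segment differs from the normal one only in the shifted noise term of the intervened node.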

Alexia-I commented 4 weeks ago

Thank you for your response. I reviewed the `generate_case` function and had a couple of observations. It seems the anomaly injection part does not include fault propagation, and the data generation for `length_abnormal` appears to be the same as for `length_normal`. I was wondering: what would be the point of testing on simulation data if it is not in line with realistic data, where fault propagation is of great importance?
https://github.com/NetManAIOps/CIRCA/blob/0215e1880096aa02a305c697f1c23cac4600ebd2/circa/experiment/simulation/__init__.py#L297
https://github.com/NetManAIOps/CIRCA/blob/0215e1880096aa02a305c697f1c23cac4600ebd2/circa/experiment/simulation/__init__.py#L266

btw, I am not entirely sure about the statement:

Notice that anomaly detection is unnecessary to generate a simulation dataset.

Could you please elaborate more on this?

limjcst commented 4 weeks ago

It seems the anomaly injection part does not include fault propagation

It does include fault propagation.

Note that we adopt the Vector Auto-regression model as the underlying data generation process for the simulation datasets. Eq. (6) in our paper formulates the model, where A is the weighted adjacency matrix encoding the causal Bayesian network (CBN). Each variable's fault propagates to other variables along the CBN. As we force the CBN to be a directed acyclic graph, we can calculate (I - A)^{-1} as the weight matrix, which is used at lines 266, 292, and 297. Meanwhile, \beta propagates the effect over time.
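To see why (I - A)^{-1} captures propagation on a DAG, note that it equals the geometric series I + A + A^2 + ..., which sums direct and indirect causal effects; the series terminates because A is nilpotent for an acyclic graph. A tiny numerical illustration (the graph and weights are made up, not from the repository):

```python
import numpy as np

# A tiny chain DAG 0 -> 1 -> 2, where A[i, j] is the edge weight from i to j
A = np.array([
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0],
])

# Total (direct + indirect) effects: (I - A)^{-1} = I + A + A^2 + ...
weight = np.linalg.inv(np.eye(3) - A)
print(weight[0])  # total effect of a unit shock at node 0 on every node
```

Here the shock at node 0 reaches node 2 only indirectly, with total effect 2 * 3 = 6, exactly the (0, 2) entry of the weight matrix.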

Could you please elaborate more on this?

I took it for granted that we cannot analyze data (e.g., anomaly detection or root cause analysis) until the data have been generated and collected. My point is that the fault exists before we start to analyze the faulty data, and thus is a component of the data generation process. Note that the data generation process with faults looks different from what it used to be.

limjcst commented 4 weeks ago

Take the following simple model as an example.

x(t) = u(t), y(t) = x(t) + v(t), where u(t), v(t) ~ N(0, 1).

If we alter x(t) to x'(t) = 1 + u(t), we will find that the expectation of y(t) changes from 0 to 1. Notice that the equation y(t) = x(t) + v(t) itself remains unchanged.

We have the following observations:

  1. such a simple model already takes fault propagation into consideration, and
  2. the data generation process remains the same except for the intervened variables.
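This two-variable example can be checked numerically. A standalone sketch (not code from the repository) that simulates both regimes and compares the expectation of y:

```python
import numpy as np

rng = np.random.default_rng(42)
size = 100_000

u = rng.normal(size=size)
v = rng.normal(size=size)

# Normal regime: x(t) = u(t), y(t) = x(t) + v(t)
x = u
y = x + v

# Faulty regime: only x is intervened on; y's structural equation is untouched
x_faulty = 1 + u
y_faulty = x_faulty + v

print(y.mean(), y_faulty.mean())  # roughly 0 and 1
```

The shift at x propagates to y through the unchanged equation y(t) = x(t) + v(t), so the sample mean of y moves from about 0 to about 1.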
Alexia-I commented 3 weeks ago

Ah I see... Thank you for clarifying. Sorry for the confusion earlier, and thanks for your patience.