M-Nauta / TCDF

Temporal Causal Discovery Framework (PyTorch): discovering causal relationships between time series
GNU General Public License v3.0

Are the delays temporal? #12

Closed svikramank closed 3 years ago

svikramank commented 3 years ago

Hi, really interesting work! I was trying to understand the delays/weights assigned to the edges in the causal network inferred by the TCN. My question is the following:

  1. Given the explanation of assigning delays, it looks like the delays/weights are temporal in nature. That is, the delay is extracted w.r.t. time: say X_2 is the effect and X_1 is the cause, and at t=16 you want to infer the causal graph and the delay. You pick X_2|t=16, look at the learned weights of the TCN, and backtrack by following the largest kernel weights to see which timestamp in X_1 you end up at. Say you end up at X_1|t=14. Then you conclude that, at t=16, X_1 causes X_2 with a delay of 2.
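To make sure I'm describing the same thing, here is a rough sketch of the backtracking idea above. This is just how I picture it, not TCDF's actual code: the kernel weights, kernel size 2, and dilations 1/2/4 are hypothetical.

```python
# Minimal sketch of "follow the largest kernel weight back to the input" for a stack
# of 1D causal convolutions with kernel size 2 and dilations 1, 2, 4 (hypothetical,
# not TCDF's implementation).
import numpy as np

# Hypothetical learned kernel weights per layer, ordered [w_old, w_recent].
kernel_weights = [
    np.array([0.1, 0.9]),   # layer 1, dilation 1
    np.array([0.8, 0.2]),   # layer 2, dilation 2
    np.array([0.3, 0.7]),   # layer 3, dilation 4
]
dilations = [1, 2, 4]

def backtrack_delay(t_out):
    """Follow the largest kernel weight from the output at t_out back to the input."""
    t = t_out
    # Walk from the top layer down to the input layer.
    for w, d in zip(reversed(kernel_weights), reversed(dilations)):
        # Kernel position 1 looks at the current timestep, position 0 looks d steps back.
        if np.argmax(w) == 0:
            t -= d
    return t_out - t  # delay = how many steps back we ended up

print(backtrack_delay(16))  # 2 here, since only the dilation-2 layer points backwards
```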

Is my understanding correct? If no, can you please explain to me where I am wrong?

If yes, then in the ground_truth dataset I see that the labels have only 3 columns, (source, target, delay), for each dataset. Thus, there is no temporal notion in the labels. For example, it just says X_2 causes X_3 with a delay of 1. But when does that delay occur? At what timestamp was the causal graph created? Say the length of the time series is 1000. Is the graph built at t=1000? If yes, is the delay calculated w.r.t. X_3|t=1000, i.e. is the delay between cause and effect calculated at the last timestep?

In the example below, what I understand is that the delay is calculated at t=16. That is, at t=16, the delay between the cause and the effect is 6 timesteps, and this would change as we check the delay at t=15 or t=17.

(attached image: the delay-backtracking example, Figure 9 from the paper)

M-Nauta commented 3 years ago

We use the term "temporal" to indicate that TCDF learns a causal graph with discovered causal relationships between time series. The causal graph, including delays, applies to the whole dataset and we therefore only need 1 ground-truth delay. So although the term "temporal" might be a bit confusing here, the delay indicates the time delay between one causal time series and one effect over all time steps in the data. This delay is learned by looping over the dataset multiple times. Of course it would be a nice improvement to learn dynamic graphs that can change over time, including delays w.r.t. time, but this is currently not supported.
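To illustrate the point that the graph and its delays apply to the whole dataset, here is a minimal sketch of how the static ground truth can be represented. The rows, column names, and values are hypothetical, not taken from a specific TCDF dataset.

```python
# One (source, target, delay) triple per causal edge; the delay is assumed to hold
# at every timestep of the series, so there is no extra time index in the labels.
import pandas as pd

ground_truth = pd.DataFrame(
    [(1, 2, 6), (2, 3, 1)],   # e.g. X_1 -> X_2 with delay 6, X_2 -> X_3 with delay 1
    columns=["source", "target", "delay"],
)

causal_graph = {(row.source, row.target): row.delay for row in ground_truth.itertuples()}
print(causal_graph)  # {(1, 2): 6, (2, 3): 1}
```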

svikramank commented 3 years ago

Understood, thanks. But then in the figure above, why is the delay calculated at the 16th timestamp, i.e. w.r.t. X_2|t=16? Does that mean that after predicting X_2 for all timesteps (i.e. 16 in this case), we always look at the last timestamp and backtrack our way to the timestamp in the X_1 time series with the highest kernel weights?

Or do we assume that the lag calculated at t=16 is the same as at t=15 or earlier? In other words, when we say the time series X_1 causes X_2 with a lag of 6 timesteps, do we mean the lag is consistent throughout time for both time series?

M-Nauta commented 3 years ago

The weights in a CNN are shared and learned by looping over the time series multiple times. So eventually the example weights in Figure 9 (values next to the coloured arrows) would be the same for the 16th, 17th and 99th timestamp. It's just a trained kernel that slides over the time series, as explained in Section 3.3. Since we use a three-layer network in this example with a dilation factor of 2, the receptive field of this network is 16 (see Equation 2 in the paper). Therefore, t=16 is the first timestamp for which we can fully backtrack the weights and hence the calculation is shown for t=16. But the delay will be the same for other timestamps in this time series as the weights of the trained CNN stay the same. This indeed implies that we assume that the lag is consistent throughout the time series (giving opportunities for improvement).
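As a back-of-the-envelope check of that receptive-field arithmetic, the sketch below uses the standard formula for dilated causal convolutions with kernel size K and dilation coefficient c; the exact notation of Equation 2 in the paper may differ, so treat this as my reading of it rather than the paper's definition.

```python
def receptive_field(kernel_size: int, dilation_c: int, hidden_layers: int) -> int:
    # Dilations grow as c**l for l = 0 .. hidden_layers; each layer widens the
    # receptive field by (kernel_size - 1) * dilation.
    dilations = [dilation_c ** l for l in range(hidden_layers + 1)]
    return 1 + (kernel_size - 1) * sum(dilations)

# Kernel size 2, dilation factor 2, three hidden layers -> dilations 1, 2, 4, 8.
print(receptive_field(2, 2, 3))  # 16, so t=16 is the first timestep that can be fully backtracked
```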

svikramank commented 3 years ago

Great, thanks! This answers my question.