fjxmlzn / DoppelGANger

[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
http://arxiv.org/abs/1909.13403
BSD 3-Clause Clear License

Reproduce Figure 1 on WWT #20

Closed · CubicQubit closed this issue 3 years ago

CubicQubit commented 3 years ago

Hi, thank you for your work and the ultra-clean code; I appreciate it a lot. I'm trying to reproduce Figure 1 in the paper and have run into some problems. I've attached the ACF plot I got below:

[image: acf]

My first concern is that the ACF for the real WWT data (either train or test) doesn't match Figure 1 in the paper. The data was downloaded directly from the GDrive link. I'm currently using statsmodels.tsa.stattools.acf to compute the ACF with nlags=550, averaged across 50,000 samples. Is this correct? I would appreciate pointers on how you calculated the ACF for the real data to get Figure 1.
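For concreteness, this is roughly what I'm doing (the array name features and its (num_samples, length) shape are just placeholders for the loaded WWT features, not names from the repo):

import numpy as np
from statsmodels.tsa.stattools import acf

def mean_acf(features, nlags=550, num_samples=50000):
    # Per-sample ACF from statsmodels, averaged over the first num_samples series.
    per_sample = [acf(x, nlags=nlags, fft=True) for x in features[:num_samples]]
    return np.mean(per_sample, axis=0)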

My second concern is with the time series generated by DoppelGANger. I used the provided codebase with sample_len = 10, but as you can see in the ACF plot, it doesn't capture the weekly seasonality as clearly as in Figure 1. Maybe this is related to how the ACF was calculated? The MSE between the DoppelGANger ACF and the real ACF is 0.0005 in my case, which is close to the 0.0009 reported in Table 3, so I'm not sure what went wrong.
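(By MSE I mean something like the mean squared difference between the two averaged ACF curves, along these lines:)

import numpy as np

def acf_mse(acf_real, acf_generated):
    # Mean squared error between two 1-D autocorrelation curves of equal length.
    return np.mean((np.asarray(acf_real) - np.asarray(acf_generated)) ** 2)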

For quick reference, the relevant figure is attached: [screenshot: Screen Shot 2021-06-09 at 5 21 46 PM]

fjxmlzn commented 3 years ago

This was the code we used for computing autocorrelation for Figure 1.

import torch

EPS = 1e-8

def autocorr(X, Y):
    # Pearson correlation between each row of X and the corresponding row of Y.
    Xm = torch.mean(X, 1).unsqueeze(1)
    Ym = torch.mean(Y, 1).unsqueeze(1)
    r_num = torch.sum((X - Xm) * (Y - Ym), 1)
    r_den = torch.sqrt(torch.sum((X - Xm)**2, 1) * torch.sum((Y - Ym)**2, 1))

    # Guard against division by zero for constant series.
    r_num[r_num == 0] = EPS
    r_den[r_den == 0] = EPS

    r = r_num / r_den
    # Zero out numerically out-of-range values.
    r[r > 1] = 0
    r[r < -1] = 0

    return r

def get_autocorr(feature):
    # feature: NumPy array of shape (num_samples, length).
    # For each lag j, correlate each series with its lag-j shifted copy and
    # average over all samples; lags 1 .. length - 2 are covered.
    feature = torch.from_numpy(feature)
    feature_length = feature.shape[1]
    autocorr_vec = torch.Tensor(feature_length - 2)

    for j in range(1, feature_length - 1):
        autocorr_vec[j - 1] = torch.mean(autocorr(feature[:, :-j],
                                                  feature[:, j:]))

    return autocorr_vec
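Roughly, it can be applied like this (a sketch only; the array names real_features / generated_features and their (num_samples, length) float32 shape are placeholders, not names from the repo):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical usage: compute and plot the averaged autocorrelation curves.
real_acf = get_autocorr(real_features.astype(np.float32))
gen_acf = get_autocorr(generated_features.astype(np.float32))

plt.plot(real_acf.numpy(), label="real")
plt.plot(gen_acf.numpy(), label="DoppelGANger")
plt.xlabel("lag")
plt.ylabel("autocorrelation")
plt.legend()
plt.show()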

Let me know if you have any further questions.