fjxmlzn / DoppelGANger

[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
http://arxiv.org/abs/1909.13403
BSD 3-Clause Clear License

Reproduce Figure 1 on WWT #20

Closed · CubicQubit closed this issue 3 years ago

CubicQubit commented 3 years ago

Hi, thank you for your work and the ultra-clean code; I appreciate it a lot. I'm trying to reproduce Figure 1 in the paper and have run into some problems. I've attached the ACF plot I got below:

[image: acf]

My first concern is that the ACF for the real WWT data (either train or test) doesn't match Figure 1 in the paper. The data was downloaded directly from the GDrive link. I'm currently using statsmodels.tsa.stattools.acf to compute the ACF with nlags=550, averaged across 50,000 samples. Is this correct? I would appreciate pointers on how you calculated the ACF for the real data to get Figure 1.
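For concreteness, this is roughly what I'm doing (the array name features and its (num_samples, length) shape are just placeholders for the loaded WWT features, not names from the repo):

import numpy as np
from statsmodels.tsa.stattools import acf

def mean_acf(features, nlags=550, num_samples=50000):
    # Per-sample ACF from statsmodels, averaged over the first num_samples series.
    per_sample = [acf(x, nlags=nlags, fft=True) for x in features[:num_samples]]
    return np.mean(per_sample, axis=0)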

My second concern is with the time series generated by DoppelGANger. I used the provided codebase with sample_len = 10, but as you can see in the ACF plot, it doesn't capture the weekly seasonality as clearly as in Figure 1. Maybe this is related to how the ACF was calculated? The MSE between the DoppelGANger ACF and the real ACF is 0.0005 in my case, which is close to the 0.0009 reported in Table 3, so I'm not sure what went wrong.
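(By MSE I mean something like the mean squared difference between the two averaged ACF curves, along these lines:)

import numpy as np

def acf_mse(acf_real, acf_generated):
    # Mean squared error between two 1-D autocorrelation curves of equal length.
    return np.mean((np.asarray(acf_real) - np.asarray(acf_generated)) ** 2)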

For quick reference, the relevant figure is attached: [screenshot: Screen Shot 2021-06-09 at 5 21 46 PM]

fjxmlzn commented 3 years ago

This was the code we used for computing autocorrelation for Figure 1.

import torch

EPS = 1e-8

def autocorr(X, Y):
    # Pearson correlation between each row of X and the corresponding row of Y.
    Xm = torch.mean(X, 1).unsqueeze(1)
    Ym = torch.mean(Y, 1).unsqueeze(1)
    r_num = torch.sum((X - Xm) * (Y - Ym), 1)
    r_den = torch.sqrt(torch.sum((X - Xm)**2, 1) * torch.sum((Y - Ym)**2, 1))

    # Guard against division by zero for constant series.
    r_num[r_num == 0] = EPS
    r_den[r_den == 0] = EPS

    r = r_num / r_den
    # Zero out numerically out-of-range values.
    r[r > 1] = 0
    r[r < -1] = 0

    return r

def get_autocorr(feature):
    # feature: NumPy array of shape (num_samples, length).
    # For each lag j, correlate each series with its lag-j shifted copy and
    # average over all samples; lags 1 .. length - 2 are covered.
    feature = torch.from_numpy(feature)
    feature_length = feature.shape[1]
    autocorr_vec = torch.Tensor(feature_length - 2)

    for j in range(1, feature_length - 1):
        autocorr_vec[j - 1] = torch.mean(autocorr(feature[:, :-j],
                                                  feature[:, j:]))

    return autocorr_vec
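Roughly, it can be applied like this (a sketch only; the array names real_features / generated_features and their (num_samples, length) float32 shape are placeholders, not names from the repo):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical usage: compute and plot the averaged autocorrelation curves.
real_acf = get_autocorr(real_features.astype(np.float32))
gen_acf = get_autocorr(generated_features.astype(np.float32))

plt.plot(real_acf.numpy(), label="real")
plt.plot(gen_acf.numpy(), label="DoppelGANger")
plt.xlabel("lag")
plt.ylabel("autocorrelation")
plt.legend()
plt.show()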

Let me know if you have any further questions.