fjxmlzn / DoppelGANger

[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
http://arxiv.org/abs/1909.13403
BSD 3-Clause Clear License

Time Series Length #4

Closed firmai closed 4 years ago

firmai commented 4 years ago

Is it possible to generate a continuous stream of time-series data with this model? For example, if I only have 10,000 samples of 500 time steps each, can I generate one sample with 4,000 steps (8 times the input length) using your package? And how is this achieved? Is it done by feeding in the previous time step?

fjxmlzn commented 4 years ago

Yes. You can generate samples longer than the training samples by changing the length parameter at https://github.com/fjxmlzn/DoppelGANger/blob/master/example_generating_data/gan_generate_data_task.py#L132 to whatever you want.
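The reason this works: a recurrent generator's unroll length is a free parameter at generation time, not something baked into the weights. A minimal NumPy sketch (not the repo's TensorFlow code; the weights here are random placeholders standing in for trained ones):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 8

# Toy "trained" RNN weights (random here, for illustration only).
W_h = rng.normal(scale=0.1, size=(hidden, hidden))    # state transition
W_z = rng.normal(scale=0.1, size=(hidden,))           # noise input
W_out = rng.normal(scale=0.1, size=(hidden,))         # output projection

def generate(length, seed=1):
    """Unroll the same RNN for `length` steps, feeding noise at each step."""
    step_rng = np.random.default_rng(seed)
    h = np.zeros(hidden)
    out = []
    for _ in range(length):
        z = step_rng.normal()              # per-step noise input
        h = np.tanh(W_h @ h + W_z * z)     # hidden state carries past info
        out.append(W_out @ h)              # one output value per step
    return np.array(out)

short = generate(500)     # the training length
longer = generate(4000)   # 8x longer: same weights, just a longer unroll
```

The same weights produce sequences of any length; whether the 4,000-step output stays realistic depends on how well the model learned the long-range structure.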

firmai commented 4 years ago

Thanks, I have been reading your paper, and I am trying to understand how this is done. In one sentence: is it just running multiple RNNs and MLPs, where each generation step includes the previously generated sample?

"This unrolled representation commonly conveys that the RNN is being used many times to generate samples."

fjxmlzn commented 4 years ago

The RNN (more specifically, an LSTM) has an internal state that stores past information, so we don't need to explicitly feed the previously generated sample into the next time step. By default, our generator inputs only a noise variable to the RNN unit at each time step. That said, our code also supports adding the previously generated sample as an extra input to the RNN unit; you can turn this on by setting feedback=True in config.py.
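The two input modes described above can be contrasted in a small NumPy sketch (names and weights are mine, not the repo's; in both modes the hidden state carries past information, and feedback merely adds the previous output as an extra input):

```python
import numpy as np

def generate(length, feedback=False, seed=0):
    """Toy recurrent generator: noise-only input, or noise + previous output."""
    rng = np.random.default_rng(seed)
    hidden = 8
    W_h = rng.normal(scale=0.1, size=(hidden, hidden))   # state transition
    W_z = rng.normal(scale=0.1, size=(hidden,))          # noise input
    W_fb = rng.normal(scale=0.1, size=(hidden,))         # feedback input
    W_out = rng.normal(scale=0.1, size=(hidden,))        # output projection
    h = np.zeros(hidden)
    prev = 0.0
    out = []
    for _ in range(length):
        z = rng.normal()
        # feedback=True adds the previously generated sample as an extra input
        x = W_z * z + (W_fb * prev if feedback else 0.0)
        h = np.tanh(W_h @ h + x)     # state stores past info in both modes
        prev = W_out @ h
        out.append(prev)
    return np.array(out)

no_fb = generate(100, feedback=False)   # noise-only input per step
fb = generate(100, feedback=True)       # previous output fed back as well
```

With the same seed, the two modes agree on the first step (the previous output is zero) and diverge afterward, which is exactly the effect of the extra feedback input.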

firmai commented 4 years ago

Excellent thanks!