Open arsinnius opened 3 years ago
Thanks for the detailed explanation and feedback @arsinnius. I am able to get some improvement by increasing the number of epochs. Unfortunately, the output never ends up looking like a smooth curve. It seems this shape is difficult for the neural network to learn.
I'm curious about how you're planning to use the synthetic data? A key strength of the PAR model is being able to model multi-sequence data. Since you only have a single sequence, I'm wondering if it could make sense to model this particular data as tabular and sort the synthetic values by DATE afterwards.
E.g.:
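from sdv.tabular import GaussianCopula
# gdp_df is the DataFrame from the issue, with DATE and GDP columns
model = GaussianCopula(default_distribution='beta')
model.fit(gdp_df)
synthetic_data = model.sample(num_rows=289)
# sort by DATE; drop=True keeps the old index from being added as a column
synthetic_data = synthetic_data.sort_values(by=['DATE']).reset_index(drop=True)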
This brought the synthetic values closer to the original ones in relative terms, though the curve still wasn't smooth.
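One way to put a number on "smooth" when comparing the real and synthetic curves is sketched below. This is only an illustration: `roughness` is a hypothetical helper (not part of SDV), and the mean absolute second difference is just one of several reasonable choices of metric.

```python
def roughness(values):
    """Mean absolute second difference, scaled by the range of the values.

    Hypothetical helper for illustration; 0.0 means the points lie on a line.
    """
    if len(values) < 3:
        raise ValueError("need at least three points")
    span = max(values) - min(values)
    if span == 0:
        return 0.0
    # Discrete second difference at each interior point.
    second_diffs = [
        values[i + 1] - 2 * values[i] + values[i - 1]
        for i in range(1, len(values) - 1)
    ]
    return sum(abs(d) for d in second_diffs) / (len(second_diffs) * span)


# Toy sequences only, not the GDP data from this issue.
smooth = [float(i) for i in range(10)]
jagged = [0.0, 5.0, 1.0, 6.0, 2.0, 7.0, 3.0, 8.0, 4.0, 9.0]
print(roughness(smooth), roughness(jagged))
```

With the DataFrames from this thread, the same function could be applied to `gdp_df['GDP'].tolist()` and `synthetic_data['GDP'].tolist()` (after sorting by DATE) to see how far apart the two curves are.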
Hello, I'm turning this into a feature request and slightly renaming it. We'll keep this open to track as we make progress.
Environment details
The code was run on Colab
Problem description
I'm using PAR to model macro variables. First, I modeled the VIX volatility index with no apparent problem. Next, I tried gross domestic product. The original GDP is plotted in the first figure - a smooth curve. The sample is plotted in the second figure. Something is clearly wrong. This raises doubt about the validity of the VIX sample.
The data was downloaded from the FRED website as a CSV file and loaded into a pandas DataFrame. The DataFrame has two columns: DATE and GDP.
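For anyone reproducing this, a minimal sketch of that loading step might look like the following. The rows are placeholder values standing in for the FRED download (not real GDP figures), and reading from an in-memory string is an assumption made so the snippet runs on its own; with the actual file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Placeholder rows standing in for the FRED CSV; the numbers are illustrative only.
csv_text = """DATE,GDP
2000-01-01,100.0
2000-04-01,101.5
2000-07-01,103.1
"""

# parse_dates converts DATE to a datetime column so sorting works chronologically.
gdp_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"])
gdp_df = gdp_df.sort_values("DATE").reset_index(drop=True)
print(gdp_df.dtypes)
```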
What I already tried
I tried a second time and got the following:
Here is the code I ran: