titu1994 / MLSTM-FCN

Multivariate LSTM Fully Convolutional Networks for Time Series Classification

Data shape for Ozone dataset #11

Open jmrichardson opened 5 years ago

jmrichardson commented 5 years ago

Thank you for sharing the code. I am trying to follow it but am confused about the data shapes, for example for the Ozone dataset. The UCI Ozone dataset contains 2,534 days of data and 72 independent variables. Looking at the dataset, which is derived from a MATLAB matrix, the resulting shapes are:

Train dataset :  (173, 72, 291) 
Test dataset :  (173, 72, 291) 

I see the 72 features (N) and the 291 time steps (M), but I am not sure where the first dimension of 173 comes from or how this shape was derived. In the "generate_ozone_dataset.py" code, it looks like there is some zero padding for variable-length data, but I don't understand how or why the number of time steps would vary. In other words, why wouldn't the time steps be a lookback window of previous ozone observations? The data in the MATLAB matrix is already normalized, so I can't see the logic behind variable-length sequences or the reason to pad with zeros.
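To make the question concrete, here is a minimal sketch of the two layouts I am comparing, using toy data rather than the actual Ozone files. The function names `lookback_windows` and `pad_to_length` are hypothetical and are not taken from the repository, and padding at the end of each sequence is only an assumption about how the padding might be done.

```python
import numpy as np


def lookback_windows(series, window):
    """Slice a (days, variables) array into overlapping lookback windows.

    Returns an array of shape (days - window + 1, variables, window).
    """
    n_days = series.shape[0]
    return np.stack(
        [series[i:i + window].T for i in range(n_days - window + 1)]
    )


def pad_to_length(sequences, max_len):
    """Zero-pad a list of (variables, t_i) arrays to (samples, variables, max_len).

    Padding is appended at the end of each sequence (an assumption).
    """
    n_vars = sequences[0].shape[0]
    out = np.zeros((len(sequences), n_vars, max_len), dtype=sequences[0].dtype)
    for i, seq in enumerate(sequences):
        out[i, :, :seq.shape[1]] = seq
    return out


rng = np.random.default_rng(0)

# What I expected: one fixed-length window per day, so no padding is needed.
toy = rng.normal(size=(10, 3))  # 10 days, 3 variables
fixed = lookback_windows(toy, window=4)
print(fixed.shape)  # (7, 3, 4)

# What the script appears to do: samples of different lengths, zero-padded.
ragged = [rng.normal(size=(3, t)) for t in (2, 5, 4)]
padded = pad_to_length(ragged, max_len=5)
print(padded.shape)  # (3, 3, 5)
```

With the first layout every sample has the same number of time steps, which is why the padding in the script is what confuses me.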

Thanks for any insight.