awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

ISSM in DeepState #851

Closed. parimuns closed this issue 4 years ago

parimuns commented 4 years ago

Does using "issm" in the DeepState estimator change the model performance, and how should issm and time_features be passed to the estimator? I am testing this on the electricity dataset.

estimator1=DeepStateEstimator(freq='H',
        prediction_length=24,
        cardinality=[370],
        add_trend= True,
        num_periods_to_train= 4,
        trainer= Trainer(epochs=10, num_batches_per_epoch=50, hybridize=False),
        num_layers= 2,
        num_cells= 40,
        cell_type= "lstm",
        num_parallel_samples = 100,
        dropout_rate= 0.1,
        use_feat_dynamic_real = False,
        use_feat_static_cat = True,
        embedding_dimension = None,
        issm =ISSM ,
        scaling = True,
        time_features = ["H"])

It gives this error:

'ValidationError: 2 validation errors for DeepStateEstimatorModel
issm
  instance of ISSM expected (type=type_error.arbitrary_type; expected_arbitrary_type=ISSM)
time_features -> 0
  instance of TimeFeature expected (type=type_error.arbitrary_type; expected_arbitrary_type=TimeFeature)'

manujosephv commented 4 years ago

These are the available ISSMs in the implementation.

level : LevelISSM()
level_trend : LevelTrendISSM()
seasonality : SeasonalityISSM(<seasonality_period>)
composite : This is the default. If you pass None, this one is selected.

Another problem is that you are passing the class, but you need to pass an instance. For example, instead of LevelISSM, it should be LevelISSM().
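The class-versus-instance distinction can be sketched in plain Python. The classes and the check_issm helper below are hypothetical stand-ins written for illustration, not the GluonTS classes or its validation code; they only mimic the kind of "instance of ISSM expected" check reported in the error. The same reasoning applies to time_features, where the error message asks for TimeFeature instances rather than strings such as "H".

```python
# Hypothetical stand-ins for illustration only; these are NOT the GluonTS classes.
class ISSM:
    """Stand-in base class for an innovation state space model."""


class LevelISSM(ISSM):
    """Stand-in for a level-only ISSM."""


def check_issm(value):
    """Mimic an 'instance of ISSM expected' check; None means 'use the default'."""
    if value is not None and not isinstance(value, ISSM):
        raise TypeError("instance of ISSM expected")
    return value


# Passing the class itself fails, because a class is not an instance of ISSM.
try:
    check_issm(LevelISSM)
    class_accepted = True
except TypeError:
    class_accepted = False

# Passing an instance (note the parentheses) passes the check.
instance_accepted = isinstance(check_issm(LevelISSM()), ISSM)

print(class_accepted, instance_accepted)  # prints: False True
```

In the estimator call from the question, this corresponds to writing issm=LevelISSM() (or issm=None for the default) instead of issm=ISSM.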

parimuns commented 4 years ago

Thanks a lot for answering. One more issue I am facing with DeepState concerns multivariate data. Suppose I have electricity data for 8 zones, with load as the target and temperature as feat_dynamic_real. I set cardinality=[8], use_feat_static_cat=True and use_feat_dynamic_real=True. Can I use this kind of data with DeepStateEstimator? When I run it, it runs out of memory.

estimator=DeepStateEstimator(freq='H',
        prediction_length=24,
         cardinality=[8],
        add_trend= False,
        num_periods_to_train= 4,
        trainer= Trainer(
            epochs=1, num_batches_per_epoch=50, hybridize=False),
        num_layers= 2,
        num_cells= 40,
        cell_type= "lstm",
        num_parallel_samples = 100,
        dropout_rate= 0.1,
        use_feat_dynamic_real = True,
        use_feat_static_cat = True,
        embedding_dimension = None,
        issm =None ,
        scaling = True,
        time_features = None)

MemoryError                               Traceback (most recent call last)
<ipython-input-129-44c508c2f597> in <module>
----> 1 predictor = estimator.train(train_ds)

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\model\estimator.py in train(self, training_data, validation_data)
    221         self, training_data: Dataset, validation_data: Optional[Dataset] = None
    222     ) -> Predictor:
--> 223         return self.train_model(training_data, validation_data).predictor

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\model\estimator.py in train_model(self, training_data, validation_data)
    206             input_names=get_hybrid_forward_input_names(trained_net),
    207             train_iter=training_data_loader,
--> 208             validation_iter=validation_data_loader,
    209         )
    210 

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\trainer\_base.py in __call__(self, net, input_names, train_iter, validation_iter)
    295                     )
    296 
--> 297                     epoch_loss = loop(epoch_no, train_iter)
    298                     if is_validation_available:
    299                         epoch_loss = loop(

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\trainer\_base.py in loop(epoch_no, batch_iter, is_training)
    228 
    229                     with tqdm(batch_iter) as it:
--> 230                         for batch_no, data_entry in enumerate(it, start=1):
    231                             if self.halt:
    232                                 break

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\tqdm\std.py in __iter__(self)
   1125 
   1126         try:
-> 1127             for obj in iterable:
   1128                 yield obj
   1129                 # Update and possibly print the progressbar.

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\dataset\loader.py in __iter__(self)
    190         assert self._cur_iter is not None
    191         while True:
--> 192             data_entry = next(self._cur_iter)
    193             self._buffer.add(data_entry)
    194             if (

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\transform.py in __call__(self, data_it, is_train)
    322     ) -> Iterator:
    323         num_idle_transforms = 0
--> 324         for data_entry in data_it:
    325             num_idle_transforms += 1
    326             try:

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\transform.py in __call__(self, data_it, is_train)
    277                 yield self.map_transform(data_entry.copy(), is_train)
    278             except Exception as e:
--> 279                 raise e
    280 
    281     @abc.abstractmethod

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\transform.py in __call__(self, data_it, is_train)
    275         for data_entry in data_it:
    276             try:
--> 277                 yield self.map_transform(data_entry.copy(), is_train)
    278             except Exception as e:
    279                 raise e

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\transform.py in map_transform(self, data, is_train)
    290 
    291     def map_transform(self, data: DataEntry, is_train: bool) -> DataEntry:
--> 292         return self.transform(data)
    293 
    294     @abc.abstractmethod

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\gluonts\transform.py in transform(self, data)
    524             if data[fname] is not None
    525         ]
--> 526         output = np.vstack(r)
    527         data[self.output_field] = output
    528         for fname in self.cols_to_drop:

c:\users\parul\appdata\local\programs\python\python37\lib\site-packages\numpy\core\shape_base.py in vstack(tup)
    281     """
    282     _warn_for_nonsequence(tup)
--> 283     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    284 
    285 

MemoryError:

lostella commented 4 years ago

One more issue I am facing with Deep State is that if I have multivariate data

@parimuns can you include code snippets and error traces in triple back-ticks please?

DeepStateEstimator supports only univariate time series, I believe. What do you mean by “multivariate”? Do you have one eight-dimensional time series in your dataset or eight one-dimensional ones?

Can you add a complete example to reproduce the problem? Something that creates fake data for the purpose is fine; you can use ListDataset to create a fake dataset that resembles yours in terms of the number and size of time series.
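A minimal sketch of such a fake dataset, contrasting the two layouts asked about above. The sizes, the start timestamp, and the random values are assumptions chosen for illustration; "start" and "target" are the string values behind FieldName.START and FieldName.TARGET. The list of entries is plain Python here, and would typically be wrapped with ListDataset(fake_entries, freq="H") before training.

```python
import random

random.seed(0)

NUM_SERIES = 8       # eight zones (assumed, to resemble the question)
NUM_HOURS = 24 * 7   # one week of hourly values, kept small for a quick test

# Eight one-dimensional series: one entry per zone, each target is a flat list.
fake_entries = [
    {
        "start": "2020-01-01 00:00:00",
        "target": [random.gauss(100.0, 10.0) for _ in range(NUM_HOURS)],
    }
    for _ in range(NUM_SERIES)
]

# By contrast, one eight-dimensional series would be a single entry whose
# target has shape (8, NUM_HOURS).
one_multivariate_entry = {
    "start": "2020-01-01 00:00:00",
    "target": [[0.0] * NUM_HOURS for _ in range(NUM_SERIES)],
}

print(len(fake_entries), len(fake_entries[0]["target"]))
```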

parimuns commented 4 years ago

Hello, thanks @lostella. My dataset consists of 8 two-dimensional time series. I have attached one year of sample data along with the code. When I train the predictor, it fails with a MemoryError. You can run this code to reproduce it. The data covers 8 zones, so the zone ID ranges from 1 to 8.

import mxnet as mx
import gluonts
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
import os
from itertools import islice
from pathlib import Path
load_train = pd.read_csv('iso_train.csv',parse_dates=[['Date', 'Time']],header=0,index_col=0)
load_train.head()
num_steps= 24 
prediction_length= 24
freq= "1H"
train_ds = ListDataset([{FieldName.TARGET:load_train.Demand,
                         FieldName.START:load_train.index[0],
                         FieldName.FEAT_DYNAMIC_REAL:load_train.Temp,
                         FieldName.FEAT_STATIC_CAT:load_train.Zone,
                        }],
                      freq="H")
test = pd.read_csv('iso_test.csv',parse_dates=[['Date', 'Time']],header=0,index_col=0)
test.head()
test_ds = ListDataset([{FieldName.TARGET:test.Demand,
                         FieldName.START:test.index[0],
                         FieldName.FEAT_DYNAMIC_REAL:test.Temp,
                         FieldName.FEAT_STATIC_CAT:test.Zone,
                        }],
                      freq="H")
train_entry = next(iter(train_ds))
train_entry.keys()
test_entry = next(iter(test_ds))
test_entry.keys()
from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer
estimator=DeepStateEstimator(freq='H',
        prediction_length=24,
         cardinality=[8],
        add_trend= False,
        num_periods_to_train= 4,
        trainer= Trainer(
            epochs=1, num_batches_per_epoch=50, hybridize=False),
        num_layers= 2,
        num_cells= 40,
        cell_type= "lstm",
        num_parallel_samples = 100,
        dropout_rate= 0.1,
        use_feat_dynamic_real = True,
        use_feat_static_cat = True,
        embedding_dimension = None,
        issm =None ,
        scaling = True,
        time_features = None)
predictor = [estimator.train(train_ds)]

data.zip

manujosephv commented 4 years ago

You are using the library incorrectly. You have eight time series, and each of them should be added as a separate entry in the ListDataset. You can take a look at a few examples in the "examples" folder in the repo. The COVID19 and M5 ones are good examples for getting the hang of the library.

The Extended Tutorial in the GluonTS documentation is also a good learning resource.

Other than that, I don't see a problem. Did you check whether the memory error is simply caused by running out of RAM?
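The one-entry-per-series layout described above can be sketched as follows. The rows are made-up sample values in a hypothetical long format (timestamp, zone, demand, temperature), not the attached data, and the grouping is done in plain Python rather than pandas to keep the sketch self-contained. The dictionary keys are the string values behind FieldName.START, FieldName.TARGET, FieldName.FEAT_DYNAMIC_REAL and FieldName.FEAT_STATIC_CAT. Mapping zones 1..8 to categories 0..7 is an assumption made so that the indices fit cardinality=[8].

```python
from collections import defaultdict

# Hypothetical long-format rows: (timestamp, zone id 1..8, demand, temperature).
rows = [
    ("2020-01-01 00:00:00", 1, 310.0, 4.5),
    ("2020-01-01 01:00:00", 1, 305.0, 4.1),
    ("2020-01-01 00:00:00", 2, 120.0, 6.0),
    ("2020-01-01 01:00:00", 2, 118.0, 5.8),
]

# Group the observations by zone so that each zone becomes its own series.
by_zone = defaultdict(list)
for timestamp, zone, demand, temp in rows:
    by_zone[zone].append((timestamp, demand, temp))

entries = []
for zone, records in sorted(by_zone.items()):
    records.sort()  # chronological order; ISO-style timestamps sort as strings
    entries.append(
        {
            "start": records[0][0],
            "target": [demand for _, demand, _ in records],
            # One row per dynamic feature, one column per time step.
            "feat_dynamic_real": [[temp for _, _, temp in records]],
            # Zero-based category index (assumption): zone 1..8 maps to 0..7.
            "feat_static_cat": [zone - 1],
        }
    )

print(len(entries))  # one entry per zone
```

The resulting list would then typically be passed to ListDataset(entries, freq="H"), giving eight univariate series that share a static zone category instead of a single series holding all zones end to end.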

parimuns commented 4 years ago

Thanks a lot. I will check the examples and get back to you if I run into any issues.