awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

Predict issue after loading (deserialized) saved predictor #1242

Closed hannahyess closed 3 years ago

hannahyess commented 3 years ago

Issue Description

I saved a trained predictor using predictor.serialize(Path("/tmp/")). After loading it back with predictor = Predictor.deserialize(Path("/tmp/")), forecasting fails with the error below. Before saving and reloading, the predictor works perfectly fine.

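import random
from pathlib import Path

import mxnet as mx
import numpy as np
from gluonts.evaluation.backtest import make_evaluation_predictions
from gluonts.model.predictor import Predictor

# Load the predictor previously saved with predictor.serialize(Path("/tmp/"))
predictor = Predictor.deserialize(Path("/tmp/"))

# Seed every RNG involved so the sampled forecasts are reproducible
random.seed(0)
np.random.seed(0)
mx.random.seed(0)

# test_ds is the test dataset defined earlier in the session (not shown here)
forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,
    predictor=predictor,
    num_samples=100,
)

tss = list(ts_it)
forecasts = list(forecast_it)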

Error message or code output

TypeError                                 Traceback (most recent call last)
<ipython-input-45-2c0f20b4f7ea> in <module>
      9                         )
     10 
---> 11 tss = list(ts_it)
     12 forecasts = list(forecast_it)

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonts/evaluation/backtest.py in ts_iter(dataset)
     80 
     81     def ts_iter(dataset: Dataset) -> pd.DataFrame:
---> 82         for data_entry in add_ts_dataframe(iter(dataset)):
     83             yield data_entry["ts"]
     84 

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonts/evaluation/backtest.py in add_ts_dataframe(data_iterator)
     67         data_iterator: Iterator[DataEntry],
     68     ) -> Iterator[DataEntry]:
---> 69         for data_entry in data_iterator:
     70             data = data_entry.copy()
     71             index = pd.date_range(

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonts/dataset/common.py in __iter__(self)
    254 
    255             data = data.copy()
--> 256             data = self.process(data)
    257             data["source"] = SourceContext(source=source_name, row=row_number)
    258             yield data

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonts/dataset/common.py in __call__(self, data)
    455     def __call__(self, data: DataEntry) -> DataEntry:
    456         for t in self.trans:
--> 457             data = t(data)
    458         return data
    459 

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonts/dataset/common.py in __call__(self, data)
    378         value = data.get(self.name, None)
    379         if value is not None:
--> 380             value = np.asarray(value, dtype=self.dtype)
    381 
    382             if self.req_ndim != value.ndim:

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

TypeError: float() argument must be a string or a number, not 'Timestamp'

Environment

(Add as much information about your environment as possible, e.g. dependencies versions.)

lostella commented 3 years ago

Hi @hannahyess! Could you provide a little more information? For example, what version of gluonts/mxnet are you using?

What predictor is giving you trouble? Did you train it out of an estimator? Which one?

If you could attach the serialized predictor (for example, zipping the files that are generated) that would be ideal!

hannahyess commented 3 years ago

Hi @lostella, I installed gluonts with: !pip install --upgrade mxnet==1.6 gluonts. I'm training on an AWS SageMaker instance with the conda_mxnet_p36 kernel. The predictor was trained from the DeepAR estimator.

hannahyess commented 3 years ago

Hi @lostella, I actually figured it out. The problem is not the deserialized model; the format of the inference data I used caused the issue. Thanks! Closing this issue.

lostella commented 3 years ago

@hannahyess nice! One note: using /tmp/ directly as the path to serialize a model is not ideal, since the model is composed of several files and directories. It’s better to create a dedicated directory there (or anywhere). If the location is meant to be temporary, it’s best to use the tempfile module from Python: https://docs.python.org/3/library/tempfile.html#tempfile.TemporaryDirectory
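The advice above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper name serialize_to_dedicated_dir and the directory name "deepar_predictor" are hypothetical, and the only GluonTS calls assumed are predictor.serialize(path) and Predictor.deserialize(path), both of which appear in the original post.

```python
import tempfile
from pathlib import Path


def serialize_to_dedicated_dir(predictor, parent: Path, name: str) -> Path:
    """Save `predictor` into its own subdirectory of `parent`.

    Hypothetical helper: it only assumes the object has a
    serialize(path) method, as GluonTS predictors do.
    """
    target = parent / name
    target.mkdir(parents=True, exist_ok=True)
    predictor.serialize(target)
    return target


# Usage with a trained GluonTS predictor (commented out because it needs
# a trained model; "deepar_predictor" is an illustrative name):
#
# from gluonts.model.predictor import Predictor
#
# with tempfile.TemporaryDirectory() as tmp:
#     model_dir = serialize_to_dedicated_dir(predictor, Path(tmp), "deepar_predictor")
#     reloaded = Predictor.deserialize(model_dir)
#     # The directory and everything in it are deleted when the block exits.
```

Keeping the model files in their own directory means that deserializing only sees files written by serialize, and cleaning up is a matter of removing that one directory.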

hannahyess commented 3 years ago

ok! Thanks @lostella

tanweer-mahdi commented 1 year ago

@hannahyess could you please elaborate on what caused that error? I am getting a slightly different one: float() argument must be a string or a number, not 'NAType'

However, I don't have any NA in my dataset!
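The thread does not say what the data format problem was, so the following is only a hedged diagnostic sketch. The traceback shows the failure inside np.asarray(value, dtype=self.dtype), which suggests that some entry of a numeric field could not be converted to float. Assuming that is the case, a small check like the one below can locate the offending entries. The helper name non_float_entries and the sample values are hypothetical. Note that pd.NA is a different object from NaN and can appear when pandas nullable dtypes are in use, even if the data shows no ordinary NaN values.

```python
import numpy as np
import pandas as pd


def non_float_entries(values):
    """Return (index, type name) for every entry that float() rejects."""
    bad = []
    for i, v in enumerate(values):
        try:
            float(v)
        except (TypeError, ValueError):
            bad.append((i, type(v).__name__))
    return bad


# Illustrative values only: a numeric field that accidentally contains
# a pandas NA marker and a timestamp.
target = [1.0, 2.5, pd.NA, pd.Timestamp("2021-01-01")]

print(non_float_entries(target))

# The same conversion that fails in the traceback above:
try:
    np.asarray(target, dtype=np.float32)
except TypeError as err:
    print(f"TypeError: {err}")
```

Running the check on each numeric field of a dataset entry before building the dataset shows which positions hold a Timestamp or NAType value instead of a number.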