awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

How to save the trained estimator #126

Closed Hari-pyt closed 5 years ago

Hari-pyt commented 5 years ago

Hi, I didn't find any documentation on saving the trained model. I want to save my model for later use.

Thanks

geoalgo commented 5 years ago

Given a predictor (see this link for an example of how to get one), you can do:

from pathlib import Path
from gluonts.model.predictor import Predictor

predictor = estimator.train(dataset.train)

# save the trained model in /tmp/ (serialize expects a Path, not a str)
predictor.serialize(Path("/tmp/"))

# load it back
predictor_deserialized = Predictor.deserialize(Path("/tmp/"))

Hari-pyt commented 5 years ago

Hi geoalgo, I tried to save the model the traditional way, with predictor.serialize("/tmp/"), but it didn't work.

thank you

lostella commented 5 years ago

@Hari7696 how doesn’t it work? Can you paste the error that you obtain? What happens? What OS and Python version are you running?

lostella commented 5 years ago

I’m reopening the issue since this is now a documentation problem.

mbohlkeschneider commented 5 years ago

Hi @Hari7696. Can you make sure that the path you are saving to exists? I just tested the code from @geoalgo and it worked fine on macOS Sierra.
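
One way to rule out a missing directory is to create it before saving. The following is a minimal sketch using only `pathlib`; the helper name `ensure_dir` and the directory name `my_model` are hypothetical, and the `serialize` call is left as a comment because it needs a trained predictor.

```python
import tempfile
from pathlib import Path


def ensure_dir(path) -> Path:
    """Create `path` (and any missing parents) and return it as a Path."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Throwaway location for the demonstration; substitute your own target directory.
model_dir = ensure_dir(Path(tempfile.mkdtemp()) / "my_model")

# The directory now exists, so it can be handed to serialize() as a Path:
# predictor.serialize(model_dir)
```

Because `exist_ok=True` is set, calling the helper again on an existing directory does nothing, so it is safe to run before every save.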

Hari-pyt commented 5 years ago

Hi @lostella @mbohlkeschneider

I am getting the following error when I don't use Path from pathlib:

predictor.serialize("/tmp/")

TypeError                                 Traceback (most recent call last)
~/.local/lib/python3.7/site-packages/gluonts/model/predictor.py in serialize(self, path)
     91             # serialize Predictor type
---> 92             with (path / 'type.txt').open('w') as fp:
     93                 fp.write(fqname_for(self.__class__))

TypeError: unsupported operand type(s) for /: 'str' and 'str'

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
<ipython-input-509-b19f9136069e> in <module>
----> 1 predictor.serialize("/tmp/")

~/.local/lib/python3.7/site-packages/gluonts/model/predictor.py in serialize(self, path)
    324     def serialize(self, path: Path) -> None:
    325         # call Predictor.serialize() in order to serialize the class name
--> 326         super().serialize(path)
    327 
    328         # serialize every GluonPredictor-specific parameters

~/.local/lib/python3.7/site-packages/gluonts/model/predictor.py in serialize(self, path)
     95             raise IOError(
     96                 f'Cannot serialize {fqname_for(self.__class__)}'
---> 97             ) from e
     98 
     99     @classmethod

OSError: Cannot serialize gluonts.model.predictor.RepresentableBlockPredictor

But when I use Path from pathlib

predictor.serialize(Path("/tmp/"))

I am not having any issue.

OS: Ubuntu 16.04, conda environment, Python 3.7.3
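
The traceback above points at the expression `path / 'type.txt'`: the `/` operator joins path components for `pathlib.Path` objects but is not defined between two strings. The sketch below reproduces that difference without GluonTS; the function name `join_type_file` is hypothetical and only mimics the line shown in the traceback.

```python
from pathlib import Path


def join_type_file(path):
    # Mirrors the expression from the traceback: path / 'type.txt'
    return path / "type.txt"


# With a plain string, "/" is not a supported operation and raises TypeError.
try:
    join_type_file("/tmp/")
    str_error = None
except TypeError as err:
    str_error = str(err)

# With a Path, "/" joins the components.
path_result = join_type_file(Path("/tmp/"))

print(str_error)
print(path_result.name)
```

Wrapping the argument as `Path("/tmp/")` before calling `serialize` therefore avoids the error.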

mbohlkeschneider commented 5 years ago

Great! Hope that helped.

jaheba commented 5 years ago

@Hari7696

Can you use ``` for code blocks? That makes it easier to read your posts.

aamrb96 commented 3 years ago

Hi everyone,

I have a similar issue, but unlike Hari7696, I would like to save the estimator itself so that I can reuse it for further training.

My workflow looks something like this:

# This outputs a dictionary of 6 dataframes
data_frames: dict = load_data(file_path)

# Outputs a gluonts estimator object given a specific config
nbeats_estimator = define_nbeats_estimator(config)

# Iterate through the different dataframes and train the nbeats model on each in turn.
for key, data_frame in data_frames.items():
    trained_nbeats = nbeats_estimator.train(data_frame)

However, this way the model is re-initialized on every call to train, so the final model has only been trained on the last dataset in my dictionary. Instead, I would like to save the current state of the estimator and continue training it on the next dataset.

Is it possible to do this somehow? This would really help me.

Thanks in advance!
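
One possible workaround, which does not continue training from a saved state, is to pool all datasets so that a single `train` call sees every series. The sketch below assumes each dataset is an iterable of GluonTS-style entry dictionaries with `start` and `target` fields; the helper `pool_datasets` and the toy data are hypothetical, and the dataframes from the question would first need converting to that format.

```python
from typing import Dict, Iterable, List


def pool_datasets(datasets: Dict[str, Iterable[dict]]) -> List[dict]:
    """Concatenate the entries of every dataset into one re-iterable list."""
    return [entry for dataset in datasets.values() for entry in dataset]


# Toy stand-ins for the datasets in the question (hypothetical values).
datasets = {
    "a": [{"start": "2020-01-01", "target": [1.0, 2.0, 3.0]}],
    "b": [
        {"start": "2020-01-01", "target": [4.0, 5.0]},
        {"start": "2020-02-01", "target": [6.0]},
    ],
}

pooled = pool_datasets(datasets)
print(len(pooled))

# Assumption: the pooled entries are in a format that train() accepts.
# trained_nbeats = nbeats_estimator.train(pooled)
```

A list is used rather than a one-shot iterator because training typically passes over the data more than once.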