sktime / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

'TimeSeriesDataSet' object has no attribute 'args' #114

Closed eduDorus closed 4 years ago

eduDorus commented 4 years ago

When I try to create a TimeSeriesDataSet for validation it shows me the error in the title. I tried:

validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
validation = TimeSeriesDataSet.from_dataset(training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True)

The training dataset is created successfully. Any ideas what could be wrong?

jdb78 commented 4 years ago

Could you post the full traceback and, if possible, the code here? What are your PyTorch and PyTorch Forecasting versions?

eduDorus commented 4 years ago

Thank you for the quick response 👍

Versions:
torch 1.7.0a0+8deb4fe
pytorch-forecasting 0.5.2
pytorch-lightning 1.0.2

Code:

max_prediction_length = 6
max_encoder_length = 24
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="logReturns",
    group_ids=["symbol"],
    min_encoder_length=max_encoder_length // 2,  # keep encoder length long (as it is in the validation set)
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    static_categoricals=["symbol"],
    static_reals=["avg_population_2017", "avg_yearly_household_income_2017"],
    time_varying_known_categoricals=["day", "hour"],
    # variable_groups={"special_days": special_days},  # group of categorical variables can be treated as one variable
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "close",
        "volume"
    ],
    target_normalizer=GroupNormalizer(
        groups=["symbol"], coerce_positive=1.0
    ),  # use softplus with beta=1.0 and normalize by group
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)

# create validation set (predict=True) which means to predict the last max_prediction_length points in time for each series
validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)

# create dataloaders for model
batch_size = 1  # set this between 32 to 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=0)

Stacktrace:

AttributeError                            Traceback (most recent call last)
in
     36
     37 # create validation set (predict=True) which means to predict the last max_prediction_length points in time for each series
---> 38 validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
     39
     40 # create dataloaders for model

/opt/conda/lib/python3.6/site-packages/pytorch_forecasting/data/timeseries.py in from_dataset(cls, dataset, data, stop_randomization, predict, **update_kwargs)
    641         """
    642         return cls.from_parameters(
--> 643             dataset.get_parameters(), data, stop_randomization=stop_randomization, predict=predict, **update_kwargs
    644         )
    645

/opt/conda/lib/python3.6/site-packages/pytorch_forecasting/data/timeseries.py in get_parameters(self)
    613         """
    614         kwargs = {
--> 615             name: getattr(self, name) for name in inspect.signature(self.__class__).parameters.keys() if name != "data"
    616         }
    617         kwargs["categorical_encoders"] = self.categorical_encoders

/opt/conda/lib/python3.6/site-packages/pytorch_forecasting/data/timeseries.py in (.0)
    613         """
    614         kwargs = {
--> 615             name: getattr(self, name) for name in inspect.signature(self.__class__).parameters.keys() if name != "data"
    616         }
    617         kwargs["categorical_encoders"] = self.categorical_encoders

AttributeError: 'TimeSeriesDataSet' object has no attribute 'args'

eduDorus commented 4 years ago

It seems there is a problem with the get_parameters() function:

dataset.get_parameters() ->

AttributeError                            Traceback (most recent call last)
in
----> 1 training.get_parameters()

/opt/conda/lib/python3.6/site-packages/pytorch_forecasting/data/timeseries.py in get_parameters(self)
    613         """
    614         kwargs = {
--> 615             name: getattr(self, name) for name in inspect.signature(self.__class__).parameters.keys() if name != "data"
    616         }
    617         kwargs["categorical_encoders"] = self.categorical_encoders

/opt/conda/lib/python3.6/site-packages/pytorch_forecasting/data/timeseries.py in (.0)
    613         """
    614         kwargs = {
--> 615             name: getattr(self, name) for name in inspect.signature(self.__class__).parameters.keys() if name != "data"
    616         }
    617         kwargs["categorical_encoders"] = self.categorical_encoders

AttributeError: 'TimeSeriesDataSet' object has no attribute 'args'

jdb78 commented 4 years ago

What do you get if you execute import inspect; list(inspect.signature(TimeSeriesDataSet).parameters.keys())? There is no "args" in my list. Are you, by any chance, working with Python 3.9? It is not currently tested, but I will add it to the test matrix.

eduDorus commented 4 years ago

If I execute the above statement I get the following: ['args', 'kwds']

Python Version is: Python 3.6.10 :: Anaconda, Inc.

eduDorus commented 4 years ago

I could fix it with the following: inspect.signature(self.__init__).parameters.keys()

The problem is what you suspected: args and kwds were in the list of parameters.
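
A possible explanation for the difference (an assumption; the thread does not confirm the root cause): when `inspect.signature` is given a class, it can report the signature of a `__new__` method that accepts `*args, **kwds` instead of the signature of `__init__`. The sketch below uses a hypothetical `Demo` class, not `TimeSeriesDataSet`, to show the effect and why inspecting `__init__` directly avoids it.

```python
import inspect


class Demo:
    """Hypothetical stand-in: a catch-all __new__ plus an explicit __init__."""

    def __new__(cls, *args, **kwds):
        # Accept anything; the real arguments are handled by __init__.
        return super().__new__(cls)

    def __init__(self, data, time_idx, target="y"):
        self.data = data
        self.time_idx = time_idx
        self.target = target


# Inspecting the class picks up __new__, so only the catch-all names appear.
class_params = list(inspect.signature(Demo).parameters)

# Inspecting __init__ on the class yields the explicit parameters (plus self).
init_params = list(inspect.signature(Demo.__init__).parameters)

# Inspecting the bound method on an instance drops self automatically.
bound_params = list(inspect.signature(Demo(None, "t").__init__).parameters)

print(class_params)
print(init_params)
print(bound_params)
```

With the bound method (`self.__init__`), the resulting names can be passed to `getattr(self, name)` without first filtering out `self`.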

Kimonili commented 3 years ago

I have a similar issue for which I cannot find a solution. I have a custom time series dataset class that inherits from the time series dataset class from this repo.

My subclass uses the entire functionality of the parent class and adds one extra argument to the init function (the extra argument is irrelevant to the issue). The rest of the parent class's functionality is accessed through super(). To show the init function whose parameters I am trying to get, this is the code:

def __init__(self, mapping_paths: list, **kwargs):
    self.map_dict = {}
    self.mapping_paths = mapping_paths
    for path in self.mapping_paths:
        with open(path, 'rb') as f:
            self.map_dict.update(pickle.load(f))
    super().__init__(**kwargs)

The error that I get is the following:

Exception has occurred: AttributeError (note: full exception trace is shown but execution is paused at: _run_module_as_main) 'CustomTimeSeriesDataset' object has no attribute 'kwargs'

Is there a way for the program to read **kwargs from the parent class and pass them all to the getattr() method?
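
The error can be reproduced without the library. The sketch below uses hypothetical `Parent` and `Child` classes and copies the `get_parameters` pattern from the traceback earlier in this thread. Because the subclass signature contains the literal name `kwargs`, the lookup `getattr(self, "kwargs")` fails.

```python
import inspect


class Parent:
    """Hypothetical stand-in for the library's dataset class."""

    def __init__(self, data, time_idx):
        self.data = data
        self.time_idx = time_idx

    def get_parameters(self):
        # Same pattern as in the traceback: one getattr per signature name.
        return {
            name: getattr(self, name)
            for name in inspect.signature(self.__class__).parameters.keys()
            if name != "data"
        }


class Child(Parent):
    def __init__(self, mapping_paths, **kwargs):
        self.mapping_paths = mapping_paths
        super().__init__(**kwargs)


child = Child(mapping_paths=[], data=None, time_idx="t")

# The subclass signature exposes "kwargs" as if it were a real parameter.
signature_names = list(inspect.signature(Child).parameters)

try:
    child.get_parameters()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(signature_names)
print(error_message)
```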

Kimonili commented 3 years ago

I found a workaround for this problem. I am sure it's not the best, but it worked for me:

keywargs = {
    name: getattr(self, name)
    for name in inspect.signature(TimeSeriesDataSet.__init__).parameters.keys()
    if name not in ["data", "self"]
}

for name in inspect.signature(self.__class__.__init__).parameters.keys():
    if name not in ['kwargs', 'self']:
        keywargs.update({name: getattr(self, name)})

This way the keywargs dictionary gets all the arguments of the parent class and the arguments of the child class. I still believe that the best way to make this work is to automatically read all the arguments of the parent class when the program runs into the **kwargs argument in the child class, but I am not sure if this is possible.
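
One way to generalize the workaround, offered as a sketch and not as the library's approach, is to walk the class hierarchy and collect the explicit `__init__` parameters of every class while skipping `self`, `*args` and `**kwargs`. The helper names `init_parameter_names` and `collect_parameters`, and the `Parent` and `Child` classes, are hypothetical. The sketch assumes each explicit parameter is stored as an attribute with the same name and that the subclass forwards `**kwargs` unchanged to its parent.

```python
import inspect


def init_parameter_names(cls):
    """Collect explicit __init__ parameter names from cls and its parents."""
    names = []
    for klass in cls.__mro__:
        init = klass.__dict__.get("__init__")
        if not inspect.isfunction(init):
            continue  # no Python-level __init__ defined here (e.g. object)
        for name, param in inspect.signature(init).parameters.items():
            is_catch_all = param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
            if name == "self" or is_catch_all or name in names:
                continue
            names.append(name)
    return names


def collect_parameters(obj, exclude=("data",)):
    """Build a name -> value dict from same-named attributes on obj."""
    return {
        name: getattr(obj, name)
        for name in init_parameter_names(type(obj))
        if name not in exclude
    }


class Parent:
    def __init__(self, data, time_idx, target):
        self.data = data
        self.time_idx = time_idx
        self.target = target


class Child(Parent):
    def __init__(self, mapping_paths, **kwargs):
        self.mapping_paths = mapping_paths
        super().__init__(**kwargs)


child = Child(mapping_paths=["a.pkl"], data=None, time_idx="t", target="y")
names = init_parameter_names(Child)
params = collect_parameters(child)
print(names)
print(params)
```

Walking `__mro__` removes the need to hard-code the parent class name, at the cost of relying on the attribute-naming convention described above.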