I am trying to follow the tutorial for interpretable forecasting with N-Beats, but with my own dataset, which is just one univariate time series. The trouble I am facing is creating a validation set that uses the same normalization as the training dataset.
I have seen other threads where adding a parameter of categorical_encoders={col_name: NaNLabelEncoder(add_nan=True)} should solve the problem, but it did not for me, and the error "Unknown category '{e.args[0]}' encountered. Set `add_nan=True` to allow unknown categories" still arises.
In the linked tutorial, col_name is the column that identifies the different time series. Since I only have one time series, I have set this to my actual time series column, which could be where the problem arises?
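For what it's worth, a common pattern for a single series is to add a constant group-id column and point the group identifier at that, rather than at the target values themselves. A minimal sketch with a made-up dataframe (column names hypothetical, not from the original code):

```python
import pandas as pd

# Hypothetical single-series dataframe: one time index, one value column.
df = pd.DataFrame({
    "time_idx": range(6),
    "value": [1000.1, 1000.5, 1000.77, 1001.2, 1000.9, 1001.5],
})

# Add a constant group-id column so TimeSeriesDataSet has a categorical
# series identifier that is identical in the training and validation splits.
df["series"] = "series_0"

print(df["series"].nunique())  # every row belongs to the same series
```

With this, the group identifier never contains unseen categories in the validation set, because it only ever has one value.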
Here is how I initialise the TimeSeriesDataSet for the training set, and here is how I try to create the validation set from it.
The error I get when trying to create the validation set is:
Traceback (most recent call last):
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\encoders.py", line 132, in transform
encoded = [self.classes_[v] for v in y]
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\encoders.py", line 132, in <listcomp>
encoded = [self.classes_[v] for v in y]
KeyError: 1000.77
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./nbeats_torch.py", line 46, in <module>
stop_randomization=True)
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\timeseries.py", line 1113, in from_dataset
dataset.get_parameters(), data, stop_randomization=stop_randomization, predict=predict, **update_kwargs
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\timeseries.py", line 1158, in from_parameters
new = cls(data, **parameters)
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\timeseries.py", line 434, in __init__
data = self._preprocess_data(data)
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\timeseries.py", line 747, in _preprocess_data
data[self.target] = self.target_normalizer.transform(data[self.target])
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\encoders.py", line 135, in transform
f"Unknown category '{e.args[0]}' encountered. Set `add_nan=True` to allow unknown categories"
KeyError: "Unknown category '1000.77' encountered. Set `add_nan=True` to allow unknown categories"
1000.77 is the first entry of my validation set.
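The KeyError makes sense once you see what a label encoder does: it maps every distinct value seen during fitting to an integer class id, so any value that never appeared in the training set has no entry in the mapping. A stripped-down sketch of that behaviour (not the pytorch_forecasting implementation):

```python
class TinyLabelEncoder:
    """Minimal stand-in for a label encoder: maps seen values to class ids."""

    def fit(self, values):
        # dict.fromkeys keeps first-seen order and de-duplicates
        self.classes_ = {v: i for i, v in enumerate(dict.fromkeys(values))}
        return self

    def transform(self, values):
        try:
            return [self.classes_[v] for v in values]
        except KeyError as e:
            raise KeyError(
                f"Unknown category '{e.args[0]}' encountered. "
                "Set `add_nan=True` to allow unknown categories"
            )


enc = TinyLabelEncoder().fit([999.5, 1000.2, 1001.0])  # "training" targets
print(enc.transform([999.5, 1000.2]))  # seen during fit: fine

try:
    enc.transform([1000.77])  # never seen during fit
except KeyError as exc:
    print(exc)  # raises, just like the traceback above
```

So a continuous target like 1000.77 is essentially guaranteed to trigger this if the target is being treated as categorical.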
Any help is appreciated!!
UPDATE:
I have added a parameter of target_normalizer=NaNLabelEncoder(add_nan=True) to the initialisation of the training set and now everything runs. However, I am getting user warnings as follows:
.\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\encoders.py:121: UserWarning: Found 3628 unknown classes which were set to NaN
UserWarning,
.\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\data\encoders.py:121: UserWarning: Found 55 unknown classes which were set to NaN
And an assertion error when calling net = NBeats.from_dataset(training, learning_rate=3e-2, weight_decay=1e-2, widths=[32, 512], backcast_loss_ratio=0.1):
Traceback (most recent call last):
File "./nbeats_torch.py", line 54, in <module>
net = NBeats.from_dataset(training, learning_rate=3e-2, weight_decay=1e-2, widths=[32, 512], backcast_loss_ratio=0.1)
File ".\anaconda3\envs\tensorflow_gpuenv\lib\site-packages\pytorch_forecasting\models\nbeats\__init__.py", line 199, in from_dataset
), "only regression tasks are supported - target must not be categorical"
AssertionError: only regression tasks are supported - target must not be categorical
UPDATE2:
By examining the error message closely and printing training.get_parameters(), I have realised that NaNLabelEncoder() was being chosen automatically as the normaliser by default. Choosing an actually appropriate normaliser, such as target_normalizer=TorchNormalizer(method='identity'), solves the problem.
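That matches the earlier assertion error: a regression target needs a numeric normaliser, not a label encoder. With method='identity' the values simply pass through as numbers, so unseen values in the validation set cannot fail. A rough conceptual sketch of the difference (not the library's actual classes):

```python
class IdentityNormalizer:
    """Continuous-target normaliser that applies no scaling, in the spirit
    of TorchNormalizer(method='identity')."""

    def fit(self, values):
        return self  # nothing to learn for the identity transform

    def transform(self, values):
        # Values pass through unchanged -- there is no lookup table,
        # so "unseen" values are impossible by construction.
        return list(values)


norm = IdentityNormalizer().fit([999.5, 1000.2, 1001.0])
print(norm.transform([1000.77]))  # [1000.77] -- no KeyError
```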