strakehyr opened this issue 2 years ago
@strakehyr looking at the `TemporalFusionTransformerEstimator` code, I can see that it requires one to explicitly set the names (and dimensions) of additional feature fields in the data. For example, for dynamic numerical features, this is where the feature fields are added: https://github.com/awslabs/gluon-ts/blob/b0d0c41cff7d8a8ab167fc16165477ba9e73329a/src/gluonts/mx/model/tft/_estimator.py#L256-L264

I think this kind of model configuration is not so bad, but it's very different from other models, which silently assume that certain field names refer to this or that type of feature, and use them if they are present.

I believe this is something we will probably modify in the future, to make sure that all models are consistent in this kind of option.
cc @jaheba
@strakehyr you may want to try configuring the estimator as in the following example:
```python
from gluonts.dataset.common import ListDataset
from gluonts.mx import TemporalFusionTransformerEstimator

# A single series of length 200 with three dynamic real features.
dataset = ListDataset(
    [{
        "start": "2021-01-01 00",
        "target": [1.0] * 200,
        "feat_dynamic_real": [[1.0] * 200] * 3,
    }],
    freq="1H",
)

# TFT requires feature fields to be declared by name and dimension.
estimator = TemporalFusionTransformerEstimator(
    freq="1H",
    prediction_length=24,
    dynamic_feature_dims={"feat_dynamic_real": 3},
)

predictor = estimator.train(dataset)
```
(Note the `dynamic_feature_dims` option.)
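At prediction time, covariates configured this way are treated as known into the future, so the feature arrays need to extend `prediction_length` steps past the end of the target (this matches the decoder behavior discussed further down in this thread). A hedged sketch of such an inference dataset, reusing the names from the example above:

```python
# Sketch: for inference, each dynamic feature row covers
# the history plus the 24-step prediction horizon.
prediction_dataset = ListDataset(
    [{
        "start": "2021-01-01 00",
        "target": [1.0] * 200,                          # history only
        "feat_dynamic_real": [[1.0] * (200 + 24)] * 3,  # history + horizon
    }],
    freq="1H",
)

forecasts = list(predictor.predict(prediction_dataset))
print(forecasts[0].quantile(0.5))  # TFT emits quantile forecasts
```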
Hi @lostella, thanks for your help. I tried to run your code but ran into:
```
predictor = estimator.train(train_ds)
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Temp/ipykernel_50220/3134606962.py", line 1, in <module>
    predictor = estimator.train(train_ds)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\mx\model\estimator.py", line 220, in train
    return self.train_model(
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\mx\model\estimator.py", line 177, in train_model
    training_data_loader = self.create_training_data_loader(
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\model\tft\_estimator.py", line 342, in create_training_data_loader
    with env._let(max_idle_transforms=maybe_len(data) or 0):
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\itertools.py", line 45, in maybe_len
    return len(obj)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 100, in __len__
    return sum(1 for _ in self)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 100, in <genexpr>
    return sum(1 for _ in self)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 103, in __iter__
    yield from self.transformation(
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 124, in __call__
    for data_entry in data_it:
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 128, in __call__
    raise e
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 126, in __call__
    yield self.map_transform(data_entry.copy(), is_train)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\_base.py", line 141, in map_transform
    return self.transform(data)
  File "C:\Users\user\Anaconda3\lib\site-packages\gluonts\transform\convert.py", line 209, in transform
    output = np.vstack(r) if not self.h_stack else np.hstack(r)
  File "<__array_function__ internals>", line 5, in vstack
  File "C:\Users\user\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 5, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 13345 and the array at index 1 has size 108
```
Do you have any advice on the long dataframe issue?
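The `ValueError` above says that two arrays being stacked have different lengths along the time axis (13345 vs. 108), which usually means a feature row and the target disagree in length, e.g. when a long dataframe is pivoted the wrong way. A quick sanity check one could run over the entries (`check_entry` is a hypothetical helper, not part of GluonTS):

```python
import numpy as np

def check_entry(entry, horizon=0):
    """Assert every dynamic feature row matches the target length
    (plus the horizon, if the features are known into the future)."""
    target_len = len(entry["target"])
    for i, row in enumerate(np.atleast_2d(entry["feat_dynamic_real"])):
        assert len(row) == target_len + horizon, (
            f"feature row {i}: length {len(row)}, "
            f"expected {target_len + horizon}"
        )

for entry in train_ds:
    check_entry(entry)
```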
@strakehyr I've run into issues using the `train()` method with covariates as well. I think it's because the TFT decoder expects future observations for covariates known into the future (vs. ones known only in the past, which only need to be available to the encoder). I've found that using `make_evaluation_predictions()` instead works, since it holds out the last `prediction_length` steps of the target without dropping the future-known covariates. So use the following for scoring your model:

```python
forecast_it, ts_total_it = make_evaluation_predictions(dataset=ds_predict, predictor=predictor)
```

where `ds_predict` = training data + scoring data.
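For completeness, a minimal sketch of that workflow with the import and the iterators materialized (assuming `ds_predict` is built as described):

```python
from gluonts.evaluation import make_evaluation_predictions

# ds_predict holds the full series (training history + scoring window);
# the last prediction_length steps of each target are held out and
# forecast, while the future-known covariates stay visible to the model.
forecast_it, ts_total_it = make_evaluation_predictions(
    dataset=ds_predict,
    predictor=predictor,
)
forecasts = list(forecast_it)  # one Forecast object per series
series = list(ts_total_it)     # full target series, for comparison
```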
Indeed that is how I run predictions.
Hi all, when training a TFT I use the following code, as seen in other notebooks:
Upon using said model, I found that changing all the covariates (dynamic real) for prediction changes absolutely nothing in the output. Is this expected behaviour with the current code? Am I supposed to prepare the data differently? The TFT seems not to be making use of the GatedResidualNetwork across covariates when the data is prepared like this.
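One way to verify whether the covariates influence the output at all is to perturb them and compare the resulting forecasts. A hedged sketch of such a test (the perturbation and the `forecast_median` helper are illustrative, not a GluonTS API):

```python
import copy
import numpy as np
from gluonts.evaluation import make_evaluation_predictions

def forecast_median(ds):
    forecast_it, _ = make_evaluation_predictions(dataset=ds, predictor=predictor)
    return next(iter(forecast_it)).quantile(0.5)

# Rescale the dynamic features of every entry by a large factor.
perturbed = []
for entry in ds_predict:
    entry = copy.deepcopy(entry)
    entry["feat_dynamic_real"] = np.asarray(entry["feat_dynamic_real"]) * 100.0
    perturbed.append(entry)

# If the covariates feed into the network, the medians should differ.
print(np.abs(forecast_median(ds_predict) - forecast_median(perturbed)).max())
```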
When preparing the data with a long dataframe, I currently get the error: