PlaytikaOSS / tft-torch

A Python library that implements "Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting"
MIT License

The regression forecast result contains only one value. #7

Closed chooron closed 2 years ago

chooron commented 2 years ago

Hello, thanks for your work on tft-torch! I am using a multivariate time series to forecast a univariate time series (similar to a stock dataset). However, I get bad forecast results when I run the model on my dataset, and I don't know what went wrong. The model config is

data_props = {'num_historical_numeric': input_size,
                  'num_future_numeric': 3,
                  'num_static_numeric': 1,
                  }
configuration = {
        'model':
            {
                'dropout': 0.05,
                'state_size': 64,
                'output_quantiles': [0.1, 0.5, 0.9],
                'lstm_layers': 2,
                'attention_heads': 4
            },
        'task_type': 'regression',
        'target_window_start': None,
        'data_props': data_props
}

I don't have any static numeric data, so I set the static numeric input to a constant 1.0. The PyTorch Lightning code is

import torch
import pytorch_lightning as pl
from tft_torch import loss as tft_loss
from tft_torch.tft import TemporalFusionTransformer


class TFTLearner(pl.LightningModule):
    def __init__(self, model: TemporalFusionTransformer):
        super(TFTLearner, self).__init__()
        self.model = model
        self.quantiles_tensor = torch.tensor([0.1, 0.5, 0.9])

    def training_step(self, batch, batch_nb):
        outputs = self.model.forward(batch)
        target = batch['target']
        predicted_quantiles = outputs['predicted_quantiles']
        q_loss, q_risk, _ = tft_loss.get_quantiles_loss_and_q_risk(outputs=predicted_quantiles,
                                                                   targets=target,
                                                                   desired_quantiles=self.quantiles_tensor)
        self.log('train_loss', q_loss, prog_bar=True)
        self.log('train_q_risk_0.1', q_risk[0][0], prog_bar=True)
        self.log('train_q_risk_0.5', q_risk[0][1], prog_bar=True)
        self.log('train_q_risk_0.9', q_risk[0][2], prog_bar=True)
        return {'loss': q_loss}

    def validation_step(self, batch, batch_nb):
        outputs = self.model.forward(batch)
        target = batch['target']
        predicted_quantiles = outputs['predicted_quantiles']
        q_loss, q_risk, _ = tft_loss.get_quantiles_loss_and_q_risk(outputs=predicted_quantiles,
                                                                   targets=target,
                                                                   desired_quantiles=self.quantiles_tensor)
        self.log('val_loss', q_loss, prog_bar=True)
        self.log('val_q_risk_0.1', q_risk[0][0], prog_bar=True)
        self.log('val_q_risk_0.5', q_risk[0][1], prog_bar=True)
        self.log('val_q_risk_0.9', q_risk[0][2], prog_bar=True)
        return {'loss': q_loss}

    def test_step(self, batch, batch_idx):
        return self.validation_step(batch, batch_idx)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.model.parameters(), lr=0.01, weight_decay=1e-2)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
        return {"optimizer": optimizer, "lr_scheduler": scheduler}
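
For reference, the quantile (pinball) loss that `get_quantiles_loss_and_q_risk` is built around can be sketched standalone like this (a minimal illustration, not the library's exact implementation):

```python
import torch


def pinball_loss(predicted_quantiles: torch.Tensor,
                 targets: torch.Tensor,
                 quantiles: torch.Tensor) -> torch.Tensor:
    """Quantile (pinball) loss.

    predicted_quantiles: [batch, horizon, num_quantiles]
    targets:             [batch, horizon]
    quantiles:           [num_quantiles]
    """
    # Broadcast targets against each predicted quantile.
    errors = targets.unsqueeze(-1) - predicted_quantiles
    # Penalize under-prediction by q and over-prediction by (1 - q).
    losses = torch.max(quantiles * errors, (quantiles - 1) * errors)
    return losses.mean()
```

A perfect prediction gives zero loss, and the q=0.5 term reduces to half the absolute error, which is why the median quantile behaves like an L1 regression target.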

I hope you can solve my problem!

Dvirbeno commented 2 years ago

Hello @chooron,

Can you please provide more details regarding:

Are you sure there aren't any static attributes associated with your data/use case? For example: if the dataset is composed of time-series data associated with specific entities, you can encode those entities, and the encodings can serve as static (categorical) inputs (see the examples in the tutorial for reference).

In the paper this repository implements, the static attributes play an essential role in the model. Take a look at the Static Covariate Encoding component (both in the paper and in the associated blog post): it affects the variable selection mechanisms, the sequence-to-sequence processing, and the representation enrichment.

Based on your response, we'll be able to tell whether there is a need for a solution that does not require the user to specify static variables.

chooron commented 2 years ago

Hello @Dvirbeno, thanks for your reply. You can find the dataset in this link. The "historical_numeric" features are ['close', 'high', 'low', 'open'], the "future_numeric" features are ['year', 'month', 'dayOfmonth'], the target is 'volume', and there aren't any static attributes.

Dvirbeno commented 2 years ago

In the case of your dataset, if you had records of other stocks, for example, you could encode the identity of the corresponding stock as a static (categorical) input.

I see that your dataset includes only a single stock. We will work on a solution for cases where there aren't any static inputs. In the meantime, you can feed the model a dummy static input, e.g. a tensor of ones (one for each element in the batch).
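
A minimal sketch of what that could look like (the key name `static_feats_numeric` is an assumption based on the tutorial's batch layout; adjust it to whatever key your dataloader uses):

```python
import torch


def add_dummy_static(batch: dict, value: float = 1.0) -> dict:
    """Attach a constant static numeric input of shape [batch_size, 1] to a batch dict."""
    batch_size = batch['target'].shape[0]
    # One dummy static feature per element in the batch, all set to `value`.
    batch['static_feats_numeric'] = torch.full((batch_size, 1), value)
    return batch
```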

By the way, it seems reasonable to include both ['year', 'month', 'dayOfmonth'] and the volume signal as part of the historical time series as well.
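
As an illustration, widening the historical input this way is just a concatenation along the feature dimension (the tensor names below are made up for the example; scaling and alignment details depend on your preprocessing):

```python
import torch

# Illustrative toy series of 100 time steps.
prices = torch.randn(100, 4)    # ['close', 'high', 'low', 'open']
calendar = torch.randn(100, 3)  # ['year', 'month', 'dayOfmonth'], already scaled
volume = torch.randn(100, 1)    # the target signal; its past values are known

# Past calendar values and past volume are legitimate historical inputs;
# only the *future* calendar values are known ahead of time.
historical = torch.cat([prices, calendar, volume], dim=-1)
```

Note that after this change, `num_historical_numeric` in `data_props` would need to be updated to match the new feature count.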

chooron commented 2 years ago

Thanks for your suggestion. I will try using records of other stocks, and I will include ['year', 'month', 'dayOfmonth'] and volume in the historical time series.