time-series-foundation-models / lag-llama

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Apache License 2.0

Predictions are quite off from the beginning #88

Open ozanbarism opened 2 months ago

ozanbarism commented 2 months ago

Hi, I am trying to test Lag-Llama on a prediction task. The dataset I feed in is used as the context, and I want the model to make predictions from that context. However, the first step of the predictions is far from the last value of the context window. Why might this be the case?

This is my code:

import numpy as np
import pandas as pd
import torch
from gluonts.dataset.common import ListDataset
from gluonts.evaluation import make_evaluation_predictions
from lag_llama.gluon.estimator import LagLlamaEstimator
from sklearn.preprocessing import MinMaxScaler


class LagLlamaModel:
    def __init__(self, checkpoint_path="/Users/ozanbaris/Documents/GitHub/TS-foundation-model/lag_src/lag-llama.ckpt"):
        self.checkpoint_path = checkpoint_path
        self.device = torch.device('cpu')
        self.scaler = MinMaxScaler()

    def _prepare_data(self, data, item_id, sampling_rate, target_name='target'):
        # Separate the values (first column) and timestamps (last column)
        values = data[:, 0]
        timestamps = data[:, -1]

        # Normalize the values to [0, 1]; the fitted scaler is reused later to denormalize
        values = np.array(values).reshape(-1, 1)
        normalized_data = self.scaler.fit_transform(values).flatten()

        # Convert numpy array to pandas dataframe using the provided timestamps
        timestamps = pd.to_datetime(timestamps)
        df = pd.DataFrame(normalized_data, columns=[target_name], index=timestamps)
        df['timestamp'] = df.index
        df['item_id'] = item_id

        # Build the frequency string from the sampling rate (in seconds)
        freq = f'{sampling_rate}S'

        # Convert DataFrame to ListDataset
        dataset = ListDataset(
            [
                {
                    "start": df['timestamp'].iloc[0],
                    "target": df[target_name].values,
                    "item_id": item_id
                }
            ],
            freq=freq
        )
        return dataset

    def get_lag_llama_predictions(self, dataset, prediction_length, context_length, use_rope_scaling=False, num_samples=100):
        # Load the checkpoint onto self.device (CPU in this setup)
        ckpt = torch.load(self.checkpoint_path, map_location=self.device)
        estimator_args = ckpt["hyper_parameters"]["model_kwargs"]

        rope_scaling_arguments = {
            "type": "linear",
            "factor": max(1.0, (context_length + prediction_length) / estimator_args["context_length"]),
        }

        estimator = LagLlamaEstimator(
            ckpt_path=self.checkpoint_path,
            prediction_length=prediction_length,
            context_length=context_length,  # Lag-Llama was trained with a context length of 32, but can work with any context length

            # estimator args
            input_size=estimator_args["input_size"],
            n_layer=estimator_args["n_layer"],
            n_embd_per_head=estimator_args["n_embd_per_head"],
            n_head=estimator_args["n_head"],
            scaling=estimator_args["scaling"],
            time_feat=estimator_args["time_feat"],
            rope_scaling=rope_scaling_arguments if use_rope_scaling else None,

            batch_size=1,
            num_parallel_samples=num_samples,
            device=self.device,
        )

        lightning_module = estimator.create_lightning_module()
        transformation = estimator.create_transformation()
        predictor = estimator.create_predictor(transformation, lightning_module)

        # make_evaluation_predictions withholds the last prediction_length
        # points of each series and forecasts that withheld window
        forecast_it, ts_it = make_evaluation_predictions(
            dataset=dataset,
            predictor=predictor,
            num_samples=num_samples
        )
        forecasts = list(forecast_it)
        tss = list(ts_it)

        return forecasts, tss

    def __call__(self, data, item_id, sampling_rate, prediction_length, use_rope_scaling=False, num_samples=100):
        dataset = self._prepare_data(data, item_id, sampling_rate)
        len_data = len(data)
        forecasts, tss = self.get_lag_llama_predictions(
            dataset,
            prediction_length,
            context_length=len_data,
            use_rope_scaling=use_rope_scaling,
            num_samples=num_samples
        )

        # Extract the forecasted normalized values (mean over samples)
        normalized_forecast = np.array([forecast.mean for forecast in forecasts]).flatten()

        # Denormalize the forecasted values with the scaler fitted in _prepare_data
        denormalized_forecast = self.scaler.inverse_transform(normalized_forecast.reshape(-1, 1)).flatten()

        return denormalized_forecast

ashok-arjun commented 1 month ago

Hi @ozanbarism, the scaling might be the issue. A quick fix is to turn off scaling; can you try that? Please also follow the related issue https://github.com/time-series-foundation-models/lag-llama/issues/85.
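
To check whether turning scaling off actually helps, it can be useful to put a number on the jump the question describes. The sketch below is not part of Lag-Llama or GluonTS: `first_step_gap` and its `held_out` parameter are hypothetical names, and the forecast samples are assumed to be an array of shape (num_samples, prediction_length), as in GluonTS's `SampleForecast.samples`. One detail worth accounting for: GluonTS's `make_evaluation_predictions` withholds the trailing `prediction_length` observations of each series and forecasts those, so in the code above the first forecast step follows the value at index `-prediction_length - 1`, not the final value of the data.

```python
import numpy as np


def first_step_gap(series, forecast_samples, held_out=0):
    """Gap between the mean first forecast step and the last value the model saw.

    series: 1-D array, the full target series handed to the predictor.
    forecast_samples: 2-D array of shape (num_samples, prediction_length).
    held_out: number of trailing observations withheld from the model
        (prediction_length when using make_evaluation_predictions,
        0 when forecasting past the end of the series).
    Returns (absolute_gap, gap_relative_to_range_of_seen_values).
    """
    series = np.asarray(series, dtype=float)
    samples = np.asarray(forecast_samples, dtype=float)
    if series.ndim != 1 or samples.ndim != 2:
        raise ValueError("expected a 1-D series and 2-D forecast samples")
    if not 0 <= held_out < len(series):
        raise ValueError("held_out must be in [0, len(series))")

    # Portion of the series the model actually conditioned on
    seen = series[: len(series) - held_out]
    last_seen = seen[-1]

    # Mean over samples of the first forecast step
    first_step = samples[:, 0].mean()

    gap = first_step - last_seen
    value_range = seen.max() - seen.min()
    relative = gap / value_range if value_range > 0 else float("nan")
    return float(gap), float(relative)


# Synthetic illustration (made-up numbers, not model output)
series = np.arange(10.0)          # 0, 1, ..., 9
samples = np.full((4, 3), 7.0)    # 4 samples, prediction_length = 3

# Comparing against the final value of the data: 7 - 9
gap_vs_end, rel_vs_end = first_step_gap(series, samples, held_out=0)
# Comparing against the last value the model saw (index 6): 7 - 6
gap_vs_seen, rel_vs_seen = first_step_gap(series, samples, held_out=3)
print(gap_vs_end, gap_vs_seen)
```

Running this once with the model's scaling enabled and once with it disabled, on the same series and in the same units, shows whether the gap shrinks.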