unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0
8.09k stars 880 forks

[BUG] Can't fit a loaded darts RNNModel #1998

Closed. Kamal-Moha closed this issue 10 months ago

Kamal-Moha commented 1 year ago

Describe the bug

I can't fit or use a saved model after loading it. When I try to fit it on new data, I get the error "FileNotFoundError: [Errno 2] No such file or directory: '/kaggle/working/darts_logs/Karachi_RNN/_model.pth.tar'".

To Reproduce

This is how I created the darts RNNModel:

%%time
n_params = {'training_length': 13, 'lr': 0.00039982048662377925, 'dropout': 0.28118606540118873, 'input_chunk_length': 6, 'Model': 'GRU', 'batch_size': 48, 'num_loader_workers': 8}
col = 'et0_fao_evapotranspiration'

train = data.loc[:'2023-07-28']
test = data.loc['2023-07-28':]

y_train = TimeSeries.from_series(train[col])
y_test = TimeSeries.from_series(test[col])

scaler = StandardScaler()

transformer = Scaler(scaler)
series_transformed = transformer.fit_transform(y_train)

early_stopper = EarlyStopping("train_loss",min_delta=0.001, patience=10,verbose=False)
callbacks = [early_stopper]

pl_trainer_kwargs = {
    "accelerator": "auto",
    "callbacks": callbacks,
}

model1 = RNNModel(
    input_chunk_length=n_params['input_chunk_length'],
    model=n_params['Model'],
    hidden_dim=20,
    dropout=n_params['dropout'],
    batch_size=n_params['batch_size'],
    n_epochs=300,
    optimizer_kwargs={"lr": n_params['lr']},
    model_name="Karachi_RNN",
    pl_trainer_kwargs=pl_trainer_kwargs,
#     log_tensorboard=True,
    random_state=42,
    training_length=n_params['training_length'],
    force_reset=True,
    save_checkpoints=True
)

model1.fit(
    series=series_transformed,
    verbose=0,
    num_loader_workers=n_params['num_loader_workers'],
)

preds = model1.predict(n=len(test), series=series_transformed)
n_preds = transformer.inverse_transform(preds)

val = rmse(y_test, n_preds)
print(f'RMSE: {val}')

#saving the model
model1.save('/kaggle/working/evapotranspiration_model.pt')

I then downloaded this model so that I can use it in a new notebook.

Loading the model:

evo_model = RNNModel.load('evapotranspiration_model.pt')

Fitting the loaded model on new data and then making predictions:

json_data = {
    "data_columns" : "weathercode,temperature_2m_max,temperature_2m_min,temperature_2m_mean,apparent_temperature_max,apparent_temperature_min,apparent_temperature_mean,sunrise,sunset,shortwave_radiation_sum,precipitation_sum,rain_sum,snowfall_sum,precipitation_hours,windspeed_10m_max,windgusts_10m_max,winddirection_10m_dominant,et0_fao_evapotranspiration"   
}

print(evo_model)

data_columns = json_data['data_columns']
now = datetime.now() - relativedelta(days=7)
start = now - relativedelta(months=11)
date_string_end = now.strftime('%Y-%m-%d')
date_string_start = start.strftime('%Y-%m-%d')
date_pred = []
for date in pd.date_range(start=datetime.now() - relativedelta(days=6), periods=10):
    date_pred.append(date.strftime('%Y-%m-%d'))

url = "https://archive-api.open-meteo.com/v1/archive"
cities = [
    { "name": "Karachi", "country": "Pakistan", "latitude": 24.8608, "longitude": 67.0104 }
]
cities_df =[]
for city in cities:
    params = {"latitude":city["latitude"],
            "longitude":city['longitude'],
            "start_date": date_string_start,
            "end_date": date_string_end,
            "daily": data_columns,
            "timezone": "GMT",
            "min": date_string_start,
            "max": date_string_end,
    }
    res = requests.get(url, params=params)
    data = res.json()
    df = pd.DataFrame(data["daily"])
    df["latitude"] = data["latitude"]
    df["longitude"] = data["longitude"]
    df["elevation"] = data["elevation"]
    df["country"] = city["country"]
    df["city"] = city["name"]
    cities_df.append(df)
concat_df = pd.concat(cities_df, ignore_index=True)
concat_df.set_index('time', inplace=True)
print(concat_df.columns)
total_hours = concat_df['precipitation_hours'].sum()
concat_df['precipitation_rate'] = concat_df['precipitation_sum']/total_hours

##generate prediction for evo_transpiration
et0_fao_evapotranspiration = TimeSeries.from_series(concat_df['et0_fao_evapotranspiration'].values)
scaler = StandardScaler()
transformer = Scaler(scaler)
series_transformed = transformer.fit_transform(et0_fao_evapotranspiration)
evo_model.fit(series=series_transformed, verbose=0)
evo_preds = evo_model.predict(n=10, series=series_transformed)
evo_preds = transformer.inverse_transform(evo_preds)
print(evo_preds)

I get the error "FileNotFoundError: [Errno 2] No such file or directory: '/kaggle/working/darts_logs/Karachi_RNN/_model.pth.tar'" when it executes the line evo_model.fit(series=series_transformed, verbose=0).

I don't understand why it's saying FileNotFound, because I have already downloaded the 'evapotranspiration_model.pt' model to my computer.

Expected behavior

I expected it to execute without any error and make predictions, because I have properly saved and loaded the RNNModel. Please help.

dennisbader commented 1 year ago

Hi @Kamal-Moha, for this use case (e.g. storing the model on one device and then loading it on another) you should use RNNModel.load_weights(). See the load_weights() docs for details.

For this to work on the new device, you first need to recreate the model the same way you did on the device where you saved it:

model = RNNModel(
    input_chunk_length=n_params['input_chunk_length'],
    model=n_params['Model'],
    hidden_dim=20,
    dropout=n_params['dropout'],
    batch_size=n_params['batch_size'],
    n_epochs=300,
    optimizer_kwargs={"lr": n_params['lr']},
    model_name="Karachi_RNN",
    pl_trainer_kwargs=pl_trainer_kwargs,
#     log_tensorboard=True,
    random_state=42,
    training_length=n_params['training_length'],
    force_reset=True,
    save_checkpoints=True
)

Then you can load the model like this:

model.load_weights("evapotranspiration_model.pt")

Also, make sure to download both files (the one ending in ".pt" and the one ending in ".pt.ckpt") and keep them in the same directory.
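The "both files in the same directory" requirement can be checked before loading. This is a minimal sketch, assuming the layout described above (a `.pt` file with a sibling `.pt.ckpt` file); `missing_model_files` is a hypothetical helper written for illustration and is not part of darts.

```python
from pathlib import Path


def missing_model_files(model_path):
    """Return the expected model files that are absent on disk.

    Assumes the layout described in this thread: `<name>.pt` plus
    `<name>.pt.ckpt` in the same directory. Hypothetical helper, not a darts API.
    """
    model_file = Path(model_path)
    # The checkpoint is expected next to the model file, with ".ckpt" appended.
    ckpt_file = model_file.with_name(model_file.name + ".ckpt")
    return [str(p) for p in (model_file, ckpt_file) if not p.is_file()]


if __name__ == "__main__":
    missing = missing_model_files("evapotranspiration_model.pt")
    if missing:
        print("Missing files:", missing)
    else:
        print("Both files found; the model can be loaded.")
```

Running this in the deployment notebook before calling load() or load_weights() turns a later, harder-to-read failure into an explicit list of the files that still need to be copied over.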

Kamal-Moha commented 1 year ago

It still doesn't work, @dennisbader, even after the suggestions given. But I just don't get it: why do I have to re-create the model structure? I have already created and trained my model and am happy with the evaluation metrics. I now want to save the model and use it for deployment, and I'd rather not have a lot of training code appear again in the deployment notebook. I have been using pickle for this with sklearn models and it has worked fine. darts seems to make model deployment really stressful.

Kamal-Moha commented 1 year ago

Having both files (the '.pt' and the '.pt.ckpt') in the same directory has at least removed the error my code was producing earlier. But when I try to make a prediction, it predicts NaN for every value, which is wrong. Check the code below.

path = '/content/drive/MyDrive/Omdena Projects/Weather Prediction for Pakistan/'
json_data = {
    "data_columns" : "weathercode,temperature_2m_max,temperature_2m_min,temperature_2m_mean,apparent_temperature_max,apparent_temperature_min,apparent_temperature_mean,sunrise,sunset,shortwave_radiation_sum,precipitation_sum,rain_sum,snowfall_sum,precipitation_hours,windspeed_10m_max,windgusts_10m_max,winddirection_10m_dominant,et0_fao_evapotranspiration",
    "evo_model" : f"{path}evapotranspiration_model.pt",
}

# Load evo_model
model = RNNModel.load(json_data['evo_model'])

data_columns = json_data['data_columns']
now = datetime.now() - relativedelta(days=7)
start = now - relativedelta(months=11)
date_string_end = now.strftime('%Y-%m-%d')
date_string_start = start.strftime('%Y-%m-%d')
date_pred = []
for date in pd.date_range(start=datetime.now() - relativedelta(days=6), periods=10):
    date_pred.append(date.strftime('%Y-%m-%d'))

url = "https://archive-api.open-meteo.com/v1/archive"
cities = [
    { "name": "Karachi", "country": "Pakistan", "latitude": 24.8608, "longitude": 67.0104 }
]
cities_df =[]
for city in cities:
    params = {"latitude":city["latitude"],
            "longitude":city['longitude'],
            "start_date": date_string_start,
            "end_date": date_string_end,
            "daily": data_columns,
            "timezone": "GMT",
            "min": date_string_start,
            "max": date_string_end,
    }
    res = requests.get(url, params=params)
    data = res.json()
    df = pd.DataFrame(data["daily"])
    df["latitude"] = data["latitude"]
    df["longitude"] = data["longitude"]
    df["elevation"] = data["elevation"]
    df["country"] = city["country"]
    df["city"] = city["name"]
    cities_df.append(df)
concat_df = pd.concat(cities_df, ignore_index=True)
concat_df.set_index('time', inplace=True)
total_hours = concat_df['precipitation_hours'].sum()
concat_df['precipitation_rate'] = concat_df['precipitation_sum']/total_hours

##generate prediction for evo_transpiration
et0_fao_evapotranspiration = TimeSeries.from_series(concat_df['et0_fao_evapotranspiration'].values)
scaler = StandardScaler()
transformer = Scaler(scaler)
series_transformed = transformer.fit_transform(et0_fao_evapotranspiration)
model.fit(series=series_transformed, verbose=0)
print(model.predict(10))

First, kindly tell me whether this is the correct way to use a saved model during deployment. If it is, please explain why my model predicts NaN for every value.

Your help is highly appreciated @dennisbader

madtoinou commented 1 year ago

Hi @Kamal-Moha,

I tried reproducing your problem with the following code snippet:

from darts.models import RNNModel
from datetime import datetime
from dateutil import relativedelta
import pandas as pd
from darts import TimeSeries
from darts.dataprocessing.transformers import Scaler
from sklearn.preprocessing import StandardScaler
from darts.datasets import AirPassengersDataset
import requests

ts = AirPassengersDataset().load().astype("float32")
model_old = RNNModel(6, n_epochs=3, training_length=4)
model_old.fit(ts)
model_old.save("ckpt_name.pt")

I then created a folder named ckpt_folder at the same level as the folder containing the notebook and moved the .pt and .ckpt files into it. I can then load the model and perform inference in another cell (possibly another notebook) with the following:

# Load evo_model
model = RNNModel.load("../ckpt_folder/ckpt_name.pt")

json_data = {
    "data_columns" : "weathercode,temperature_2m_max,temperature_2m_min,temperature_2m_mean,apparent_temperature_max,apparent_temperature_min,apparent_temperature_mean,sunrise,sunset,shortwave_radiation_sum,precipitation_sum,rain_sum,snowfall_sum,precipitation_hours,windspeed_10m_max,windgusts_10m_max,winddirection_10m_dominant,et0_fao_evapotranspiration",
}
data_columns = json_data['data_columns']
now = datetime.now() - relativedelta.relativedelta(days=7)
start = now - relativedelta.relativedelta(months=11)
date_string_end = now.strftime('%Y-%m-%d')
date_string_start = start.strftime('%Y-%m-%d')
date_pred = []
for date in pd.date_range(start=datetime.now() - relativedelta.relativedelta(days=6), periods=10):
    date_pred.append(date.strftime('%Y-%m-%d'))

url = "https://archive-api.open-meteo.com/v1/archive"
cities = [
    { "name": "Karachi", "country": "Pakistan", "latitude": 24.8608, "longitude": 67.0104 }
]
cities_df =[]
for city in cities:
    params = {"latitude":city["latitude"],
            "longitude":city['longitude'],
            "start_date": date_string_start,
            "end_date": date_string_end,
            "daily": data_columns,
            "timezone": "GMT",
            "min": date_string_start,
            "max": date_string_end,
    }
    res = requests.get(url, params=params)
    data = res.json()
    df = pd.DataFrame(data["daily"])
    df["latitude"] = data["latitude"]
    df["longitude"] = data["longitude"]
    df["elevation"] = data["elevation"]
    df["country"] = city["country"]
    df["city"] = city["name"]
    cities_df.append(df)
concat_df = pd.concat(cities_df, ignore_index=True)
concat_df.set_index('time', inplace=True)
total_hours = concat_df['precipitation_hours'].sum()
concat_df['precipitation_rate'] = concat_df['precipitation_sum']/total_hours

##generate prediction for evo_transpiration
et0_fao_evapotranspiration = TimeSeries.from_series(concat_df['et0_fao_evapotranspiration'].values).astype("float32")
scaler = StandardScaler()
transformer = Scaler(scaler)
series_transformed = transformer.fit_transform(et0_fao_evapotranspiration)
model.fit(series=series_transformed, verbose=0)
print(model.predict(10))

Which part of the process do you find counter-intuitive or unclear?

NaN values in the forecast are often caused by NaN values in the training (or inference) dataset. Make sure that, after converting your data into series, they contain no NaN values (which can happen if dates are missing, for example).
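One way to run that check on the pandas side, before building the TimeSeries, is sketched below. It assumes a daily series indexed by date strings, like the `time` index in the snippets above; `report_gaps` is a hypothetical helper and the toy values are made up for illustration.

```python
import numpy as np
import pandas as pd


def report_gaps(series, freq="D"):
    """Count NaN values and list timestamps missing from a regular index."""
    index = pd.DatetimeIndex(series.index)
    # Every timestamp a gap-free series would contain between first and last date.
    expected = pd.date_range(index.min(), index.max(), freq=freq)
    return {
        "nan_count": int(series.isna().sum()),
        "missing_dates": [d.strftime("%Y-%m-%d") for d in expected.difference(index)],
    }


# Toy data (made up for illustration): one NaN value and one absent day.
toy = pd.Series(
    [4.1, np.nan, 3.9, 4.3],
    index=["2023-07-01", "2023-07-02", "2023-07-03", "2023-07-05"],
)
report = report_gaps(toy)
print(report)
```

If either entry is non-empty, the series needs cleaning (for example, dropping or filling the affected days) before it is passed to fit() or predict().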