awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

Running DeepAR in loops gives different results from running the non-first experiment directly on GPU #1084

Open Arfea opened 3 years ago

Arfea commented 3 years ago

Hello, I need to run DeepAR with nested cross-validation and hyperparameter tuning, so I ran the example code in a simple loop to check reproducibility. I found that the second experiment (the one with 2 LSTM layers) gives different results when run inside the loop than when run directly (i.e. running the code below with `for l in [2]:`). This only happens on GPU (I use a Tesla V100 and CUDA 10.2). Does anyone know how to fix the CUDA random seed in MXNet? Many thanks!


from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer
import random
from gluonts.core.component import get_mxnet_context
import pandas as pd
from gluonts.dataset.common import ListDataset
from gluonts.evaluation.backtest import make_evaluation_predictions
from gluonts.evaluation import Evaluator
import mxnet as mx
import numpy as np
mx.random.seed(0)
np.random.seed(0)
url = "https://raw.githubusercontent.com/numenta/NAB/master/data/realTweets/Twitter_volume_AMZN.csv"
df = pd.read_csv(url, header=0, index_col=0)
training_data = ListDataset(
    [{"start": df.index[0], "target": df.value[:"2015-04-05 00:00:00"]}],
    freq = "5min")
test_data = ListDataset(
    [
        {"start": df.index[0], "target": df.value[:"2015-04-10 03:00:00"]},
        {"start": df.index[0], "target": df.value[:"2015-04-15 18:00:00"]},
        {"start": df.index[0], "target": df.value[:"2015-04-20 12:00:00"]}
    ],
    freq = "5min"
)
for l in [1,2]:
    mx.random.seed(0)
    np.random.seed(0)
    random.seed(0)
    print(get_mxnet_context())  # gpu(0)
    # Also tried: mx.random.seed(0, ctx=mx.gpu(0)) -- same result as removing it
    mx.random.seed(0, get_mxnet_context())
    estimator = DeepAREstimator(
        freq="5min",
        prediction_length=36,
        trainer=Trainer(epochs=10),
        num_layers=l,
    )
    predictor = estimator.train(training_data=training_data)
    forecast_it, ts_it = make_evaluation_predictions(test_data, predictor=predictor, num_samples=100)
    forecasts = list(forecast_it)
    tss = list(ts_it)
    evaluator = Evaluator(quantiles=[0.5], seasonality=2016)

    agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(test_data))
    print('layers: '+str(l))
    print(agg_metrics)
    print('\n')
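For what it's worth, reseeding at the top of every loop iteration (as the code above does) is the right instinct: without it, the second experiment sees RNG state already consumed by the first. A minimal stdlib-only toy (not GluonTS code) illustrating that effect:

```python
import random

def run_experiment(n=3):
    # Stand-in for a training run: each call consumes RNG state.
    return [random.random() for _ in range(n)]

# Without reseeding inside the loop, the second "experiment" starts
# from RNG state left behind by the first one:
random.seed(0)
second_in_loop = [run_experiment() for _ in range(2)][1]
random.seed(0)
second_alone = run_experiment()
print(second_in_loop == second_alone)  # False: depends on the first run

# Reseeding at the top of every iteration makes each one self-contained:
results = []
for _ in range(2):
    random.seed(0)
    results.append(run_experiment())
print(results[0] == results[1])  # True
```

Since the code above already reseeds inside the loop yet still diverges on GPU, the remaining non-determinism likely lives outside the Python-level RNGs (e.g. non-deterministic GPU kernels or cached engine state), not in the seeding itself.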
lynnhuang97 commented 2 years ago

Excuse me, I'm facing the exact same problem. Have you found a solution yet?

mraapshockwavemedical commented 3 months ago

It doesn't seem to be a problem with the seed: for me the second experiment always gives result X when run in the loop `for l in [1,2]`, and always gives result Y when run as `for l in [2]`, where X != Y, which is the unexpected behavior. I tried

del estimator, predictor
gc.collect()

but that doesn't fix it either. Btw I'm using TemporalFusionTransformerEstimator.

Any ideas?
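One thing worth trying (a suggestion, not a confirmed fix): since seeds alone don't remove the divergence, it may come from cuDNN selecting non-deterministic GPU kernels. MXNet documents an `MXNET_ENFORCE_DETERMINISM` environment variable that restricts algorithm choices to deterministic ones; it must be set before MXNet is imported, and whether it covers all relevant ops depends on your MXNet build and version:

```python
import os

# Must be set before MXNet is imported; asks cuDNN/MXNet to use only
# deterministic algorithm choices (documented MXNet env var; coverage
# may vary by version).
os.environ["MXNET_ENFORCE_DETERMINISM"] = "1"
# import mxnet as mx  # import only after the flag is set
```

If the two runs still differ with this flag set, the leaked state is probably elsewhere (e.g. a process-level cache), and running each experiment in a fresh process would be the safest workaround.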