gzerveas / mvts_transformer

Multivariate Time Series Transformer, public version
MIT License

forecaster.predict results #18

Closed luis-gnz11 closed 2 years ago

luis-gnz11 commented 2 years ago

I've launched one epoch of training on the toy2 dataset and modified the train.py code to call forecaster.predict twice on the first test data sample:

```python
xc, yc, xt, _ = test_samples
yt_pred1 = forecaster.predict(xc, yc, xt)
print("yt_pred 1[0][0]:")
print(yt_pred1[0][0])
yt_pred2 = forecaster.predict(xc, yc, xt)
print("yt_pred 2[0][0]:")
print(yt_pred2[0][0])
```

But I'm getting two different prediction results for the same input (same xc, yc and xt):

```
yt_pred 1[0][0]:
tensor([ 0.2833,  0.2584,  0.3955,  0.1239,  0.1491, -0.2220,  0.3673,  0.1451,
         0.0191,  0.0947,  0.4993, -0.2045,  0.2724,  0.0498,  0.0839,  0.2188,
         0.0291, -0.0505,  0.2537,  0.2825])
yt_pred 2[0][0]:
tensor([-0.0851,  0.1524,  0.1037, -0.0464, -0.1989,  0.0934,  0.0636,  0.0913,
         0.2973,  0.0513,  0.3559,  0.1850,  0.1016,  0.1844,  0.5109,  0.0665,
         0.2945,  0.3052,  0.3375,  0.1235])
```

Why two different predictions? What am I missing?

gzerveas commented 2 years ago

Hi, I haven't looked at your code (have you pushed it as a branch / pull request?), but typically, when this happens during inference, it means that you haven't called model.eval() to switch to evaluation mode before calling the model. You have to do this to get deterministic output, e.g. to ensure that dropout is disabled and that batch normalization uses its running statistics.
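For reference, a minimal sketch of deterministic inference along these lines, reusing the `forecaster`, `xc`, `yc`, `xt` names from the snippet above (the `torch.no_grad()` context is an extra assumption here, it avoids building the autograd graph but is not required for determinism):

```python
import torch

forecaster.eval()          # evaluation mode: dropout disabled, batch norm uses running statistics
with torch.no_grad():      # gradient tracking is not needed for inference
    yt_pred1 = forecaster.predict(xc, yc, xt)
    yt_pred2 = forecaster.predict(xc, yc, xt)

# In eval mode, identical inputs yield identical outputs
assert torch.allclose(yt_pred1[0][0], yt_pred2[0][0])
```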

luis-gnz11 commented 2 years ago

Exactly, the non-deterministic output was probably because dropout/batch normalization were still active. A call to forecaster.eval() before inference solved the problem. I thought that, when using PyTorch Lightning, all calls to .train() and .eval() were made implicitly. Thanks, George.
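On the Lightning point: as far as I know, the automatic .train()/.eval() switching only happens inside the Trainer's own fit/validate/test/predict loops; calling the model directly in a custom script, as in the snippet above, bypasses it, so the mode has to be set manually. A hypothetical illustration, assuming a LightningModule (`lightning_module`) and a prediction DataLoader (`predict_loader`), neither of which appears in this thread:

```python
import torch
import pytorch_lightning as pl

trainer = pl.Trainer(logger=False)

# Inside the Trainer's predict loop, Lightning puts the model in eval mode
# and disables gradients automatically:
preds = trainer.predict(lightning_module, dataloaders=predict_loader)

# Calling the model directly bypasses the Trainer, so the mode stays whatever
# it was last set to -- switch it explicitly before manual inference:
lightning_module.eval()
with torch.no_grad():
    out = lightning_module(batch)   # invokes forward()
```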