Nixtla / mlforecast

Scalable machine 🤖 learning for time series forecasting.
https://nixtlaverse.nixtla.io/mlforecast
Apache License 2.0

Get different performances on different devices #360

Open kkckk1110 opened 2 weeks ago

kkckk1110 commented 2 weeks ago

What happened + What you expected to happen

Hello, I am using MLForecast with XGBoost for a forecasting task. I recently found that the same code and data produce different results on two devices, one running macOS and the other Windows. I have made sure that the versions of the mlforecast, optuna, and xgboost packages are the same. I also set random seeds and can reproduce the results on a single device, but the results from the two devices differ. Why does this happen? Is it normal, or did something go wrong that I haven't noticed?

Versions / Dependencies

xgboost == 1.5.0, optuna == 3.2.0, mlforecast == 0.11.5

Reproduction script

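import optuna
import pandas as pd
from mlforecast import MLForecast
from xgboost import XGBRegressor

# objective_app and logging_callback are defined elsewhere (not shown)
sampler = optuna.samplers.TPESampler(seed=42)
study_app = optuna.create_study(direction='minimize', sampler=sampler)
study_app.optimize(objective_app, n_trials=500, callbacks=[logging_callback])  # run the search for 500 trials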

# build the model from the best hyperparameters found by the study
params_app = study_app.best_params
models_app = [
    XGBRegressor(
        random_state=42,
        n_estimators=500,
        learning_rate=params_app['learning_rate'],
        max_depth=params_app['max_depth'],
        min_child_weight=params_app['min_child_weight'],
        subsample=params_app['subsample'],
        colsample_bytree=params_app['colsample_bytree'],
    )
]

model_app = MLForecast(models=models_app, freq='MS')
# fit on train + validation data, then forecast h steps using the test features
model_app.fit(pd.concat([train, valid], axis=0), id_col='unique_id', time_col='ds', target_col='sales', static_features=[], fitted=True)
p = model_app.predict(h, X_df=test.iloc[:, :-1])

Issue Severity

None

jmoralez commented 1 week ago

Hey. Are the sampled parameters the same, with only the scores differing? I think the most likely cause is some difference in XGBoost due to multithreading. Do your devices have a different number of threads?
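
One way to check both questions is sketched below. It is a rough diagnostic, not part of mlforecast: `diff_params` is a hypothetical helper, and the two dictionaries are made-up stand-ins for the `study_app.best_params` output you would export from each machine. If the parameters match and only the scores differ, passing `n_jobs=1` to `XGBRegressor` on both machines may help test the threading hypothesis.

```python
import json
import math
import os


def diff_params(params_a, params_b, rel_tol=1e-12):
    """Return {name: (value_a, value_b)} for every hyperparameter that differs."""
    diffs = {}
    for name in sorted(set(params_a) | set(params_b)):
        a = params_a.get(name)
        b = params_b.get(name)
        # Compare floats with a tolerance, everything else exactly.
        both_float = isinstance(a, float) and isinstance(b, float)
        same = math.isclose(a, b, rel_tol=rel_tol) if both_float else a == b
        if not same:
            diffs[name] = (a, b)
    return diffs


# Hypothetical values standing in for study_app.best_params on each machine.
params_macos = {"learning_rate": 0.05, "max_depth": 6, "subsample": 0.8}
params_windows = {"learning_rate": 0.05, "max_depth": 7, "subsample": 0.8}

# JSON is a simple way to move the parameters from one machine to the other.
exported = json.dumps(params_macos, sort_keys=True)
params_macos_loaded = json.loads(exported)

print("logical CPUs on this machine:", os.cpu_count())
print("differing parameters:", diff_params(params_macos_loaded, params_windows))
```

An empty result means the Optuna search picked the same parameters on both devices, which would point at model fitting rather than the sampler as the source of the difference.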