optuna / optuna-examples

Examples for https://github.com/optuna/optuna
MIT License

Problem with DDP training with PyTorch Lightning #244

Closed 812130247 closed 3 months ago

812130247 commented 3 months ago

I combined Optuna with PyTorch and found that each card (GPU) used different hyperparameters, yet the reported results were identical. Here are my logs:

[I 2024-03-08 16:34:45,973] Trial 11 finished with value: 7.734903812408447 and parameters: {'lr': 6.477031732045171e-05, 'lrp': 7.3735426309977e-05, 'momentum': 0.23995553400695774, 'weight_decay': 0.0002490049795520556, 'reg_weight': 1.3615269556074435e-05}. Best is trial 6 with value: 8.340901374816895.
[I 2024-03-08 16:34:45,982] Trial 9 finished with value: 7.734903812408447 and parameters: {'lr': 0.000366936214951141, 'lrp': 2.4497796281549478e-06, 'momentum': 0.4832448834264219, 'weight_decay': 0.00010127314510652736, 'reg_weight': 0.0007631575559616638}. Best is trial 6 with value: 8.340901374816895.
[I 2024-03-08 14:31:21,734] Trial 8 finished with value: 8.340901374816895 and parameters: {'lr': 0.0009360795011256953, 'lrp': 5.071527261691034e-06, 'momentum': 0.4066387937956065, 'weight_decay': 1.749854409598497e-05, 'reg_weight': 2.883328966912541e-06}. Best is trial 8 with value: 8.340901374816895.
[I 2024-03-08 14:31:21,817] Trial 7 finished with value: 8.340901374816895 and parameters: {'lr': 0.0015239172598450602, 'lrp': 4.4419448643739234e-06, 'momentum': 0.3931021317610983, 'weight_decay': 0.000148263027387392, 'reg_weight': 1.0243409389510802e-05}. Best is trial 6 with value: 8.340901374816895.
[I 2024-03-08 14:31:21,817] Trial 6 finished with value: 8.340901374816895 and parameters: {'lr': 0.00017132506280220382, 'lrp': 9.145028867699667e-05, 'momentum': 0.21040102256480303, 'weight_decay': 0.000321756576830038, 'reg_weight': 1.8567214165038956e-06}. Best is trial 6 with value: 8.340901374816895.
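The pattern in the log lines above is easier to see when the finished trials are grouped by their reported value. The following is only a minimal sketch for inspecting such logs, not part of the original report: the helper name `group_trials_by_value` is hypothetical, the regular expression assumes the "Trial N finished with value: V" wording shown above, and the sample text is an abbreviated copy of the logs with the parameter dictionaries left out.

```python
import re
from collections import defaultdict

# Assumes the "Trial N finished with value: V" wording seen in the logs above.
TRIAL_RE = re.compile(r"Trial (\d+) finished with value: ([0-9.eE+-]+)")


def group_trials_by_value(log_text):
    """Hypothetical helper: map each reported value to the sorted trial numbers that produced it."""
    groups = defaultdict(list)
    for number, value in TRIAL_RE.findall(log_text):
        groups[float(value)].append(int(number))
    return {value: sorted(numbers) for value, numbers in groups.items()}


# Abbreviated copies of the reported log lines (parameter dictionaries omitted).
LOG = """
[I 2024-03-08 16:34:45,973] Trial 11 finished with value: 7.734903812408447 and parameters: ...
[I 2024-03-08 16:34:45,982] Trial 9 finished with value: 7.734903812408447 and parameters: ...
[I 2024-03-08 14:31:21,734] Trial 8 finished with value: 8.340901374816895 and parameters: ...
[I 2024-03-08 14:31:21,817] Trial 7 finished with value: 8.340901374816895 and parameters: ...
[I 2024-03-08 14:31:21,817] Trial 6 finished with value: 8.340901374816895 and parameters: ...
"""

if __name__ == "__main__":
    # Print every value together with the trials that reported it.
    for value, numbers in group_trials_by_value(LOG).items():
        print(value, numbers)
```

Any value that maps to more than one trial number marks trials that finished with identical results despite different sampled parameters, which is the situation described here.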

Shouldn't the parameters for each card be the same?

nzw0301 commented 3 months ago

Please use https://github.com/optuna/optuna/discussions because this question is not about optuna-examples. You will get a faster reply if you share minimal reproducible code when asking a question.

Best,