microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License

Mismatched hyperparameters between web server display and their actual values #5726

Open WenjieDu opened 6 months ago

WenjieDu commented 6 months ago

Describe the issue:

Environment:

Configuration:

 - Trial config:
```yaml
trial:
  command: enable_tuning=1 pypots-cli tuning --model pypots.imputation.MRNN --train_set ../../data/ettm1/train.h5 --val_set ../../data/ettm1/val.h5
  codeDir: .
  gpuNum: 1

localConfig:
  useActiveGpu: true
  maxTrialNumPerGpu: 20
  gpuIndices: 3
```

 - Search space:
```json
{
  "n_steps": {"_type": "choice", "_value": [60]},
  "n_features": {"_type": "choice", "_value": [7]},
  "patience": {"_type": "choice", "_value": [10]},
  "epochs": {"_type": "choice", "_value": [200]},
  "rnn_hidden_size": {"_type": "choice", "_value": [16, 32, 64, 128, 256, 512]},
  "lr": {"_type": "loguniform", "_value": [0.0001, 0.01]}
}
```

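For reference, NNI documents `loguniform` as drawing `exp(uniform(log(low), log(high)))`, so every sampled `lr` from the search space above should fall within [0.0001, 0.01]. A minimal sketch of that sampling rule in plain Python, independent of NNI itself:

```python
import math
import random

def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Log-uniform sample in [low, high]: exp(uniform(log(low), log(high)))."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)
samples = [sample_loguniform(1e-4, 1e-2, rng) for _ in range(1000)]
# Every sample stays inside the configured bounds.
assert all(1e-4 <= s <= 1e-2 for s in samples)
```

Both `lr` values quoted below (0.00086… and 0.00544…) are inside this range, so the mismatch is not a matter of one value being out of bounds; the issue is that two different in-range samples are being attributed to the same trial.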
Log message:

How to reproduce it?:

Note that in nnimanager.log, the lr of trial XsB6F is 0.0008698020401037771, which is also the value displayed on the local web page. However, in the nnictl stdout log, the actual lr received by the model is 0.0054442307300676335, so the two do not match. This is not an isolated case: for some trials, the hyperparameters reported by nnimanager differ from the values the trials actually receive, while for other trials they match and everything is fine.
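One way to confirm the mismatch systematically is to log the parameter dict each trial actually receives (NNI's real trial API for this is `nni.get_next_parameter()`) together with the trial ID, then diff it against what nnimanager.log / the web UI reports for that same trial. A generic diff helper, sketched in plain Python (the two dicts would be parsed from the trial log and the manager log; the `rnn_hidden_size` value here is purely illustrative, while the two `lr` values are the ones quoted above):

```python
def diff_params(reported: dict, received: dict) -> dict:
    """Return {name: (reported_value, received_value)} for every
    hyperparameter whose two values disagree."""
    mismatches = {}
    for name in reported.keys() | received.keys():
        a, b = reported.get(name), received.get(name)
        if a != b:
            mismatches[name] = (a, b)
    return mismatches

# lr values from this report; rnn_hidden_size is a hypothetical matching entry.
reported = {"rnn_hidden_size": 128, "lr": 0.0008698020401037771}  # nnimanager / web UI
received = {"rnn_hidden_size": 128, "lr": 0.0054442307300676335}  # trial stdout
print(diff_params(reported, received))  # only "lr" differs
```

Running this per trial would show exactly which trials and which hyperparameters diverge, which should help narrow down whether the mismatch happens on the manager side or in parameter delivery.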

axinbme commented 6 months ago

I had the same problem.

void-echo commented 4 months ago

Plus one 🤣

WenjieDu commented 1 month ago

Seriously? Is nobody taking care of this high-risk issue?

sertreet commented 3 days ago

Plus one, me too.