Discussion on model hyperparameter optimization (HPO) with PyPOTS and NNI

Justin0388 commented 4 months ago

Issue description

We can share experiences and discuss the problems encountered during HPO in this issue.

WenjieDu commented 4 months ago

Hi there 👋,

Thank you so much for your attention to PyPOTS! You can follow me on GitHub to receive the latest news of PyPOTS. If you find PyPOTS helpful to your work, please star⭐️ this repository. Your star is your recognition, which can help more people notice PyPOTS and grow PyPOTS community. It matters and is definitely a kind of contribution to the community.

I have received your message and will respond ASAP. Thank you for your patience! 😃

Best, Wenjie

Justin0388 commented 4 months ago

A simple tutorial on Hyperparameter Optimization (HPO) in PyPOTS

Environment

• OS: Windows 10
• IDE: PyCharm 2023.2 Community
• Python version: 3.9.13
• Torch version: 2.3.0+cu121
• NNI version: 3.0
• Training service: local

This tutorial is based on the GitHub repo Awesome_Imputation and runs the SAITS model HPO experiments on the air_quality dataset:

Step 1. Data preparation (download and preprocessing)

Run the Python script to download and preprocess the dataset. In Awesome_Imputation, the script is located at: _time_series_imputation_survey_code/data_processing/gene_airquality.py

If the script runs successfully, you can find the data files in the directory: _time_series_imputation_survey_code/data_processing/data_processing/data

[Screenshot: data directory path]

Step 2. NNI configuration (search space and experiment)

• Search space configuration:

The search space configuration file is located at _time_series_imputation_survey_code/PyPOTS_tuning_configs/SAITS/SAITS_AQI_tuning_space.json. In this file, you list the hyperparameters to be optimized and the search space for each of them. Entries with a single value (such as n_steps and n_features here) are effectively fixed, not searched. You can also refer to the NNI documentation to modify the file according to your needs.

{
    "n_steps": {"_type":"choice","_value":[24]},
    "n_features":  {"_type":"choice","_value":[132]},
    "epochs":  {"_type":"choice","_value":[200]},
    "patience":  {"_type":"choice","_value":[10]},
    "n_layers":  {"_type":"choice","_value":[1,2,3]},
    "d_model":  {"_type":"choice","_value":[64]},
    "d_ffn":  {"_type":"choice","_value":[64, 128, 256, 512, 1024]},
    "n_heads":  {"_type":"choice","_value":[1,2,4,8]},
    "d_k":  {"_type":"choice","_value":[32, 64, 128, 256]},
    "d_v":  {"_type":"choice","_value":[32, 64, 128, 256]},
    "dropout":  {"_type":"choice","_value":[0,0.1,0.2,0.3,0.4,0.5]},
    "attn_dropout":  {"_type":"choice","_value":[0,0.1,0.2,0.3,0.4,0.5]},
    "lr":{"_type":"loguniform","_value":[0.00005,0.01]}
}
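
To get a feel for how large this search space is and what a random search draws from it, here is a minimal Python sketch. It is not NNI's own sampler: `count_discrete_combinations` and `sample_config` are hypothetical helper names introduced only for illustration, and the `loguniform` rule assumes the definition given in the NNI documentation, exp(uniform(log(low), log(high))).

```python
import json
import math
import random

# The search space from SAITS_AQI_tuning_space.json, inlined so the sketch is self-contained.
SEARCH_SPACE = json.loads("""
{
    "n_steps": {"_type": "choice", "_value": [24]},
    "n_features": {"_type": "choice", "_value": [132]},
    "epochs": {"_type": "choice", "_value": [200]},
    "patience": {"_type": "choice", "_value": [10]},
    "n_layers": {"_type": "choice", "_value": [1, 2, 3]},
    "d_model": {"_type": "choice", "_value": [64]},
    "d_ffn": {"_type": "choice", "_value": [64, 128, 256, 512, 1024]},
    "n_heads": {"_type": "choice", "_value": [1, 2, 4, 8]},
    "d_k": {"_type": "choice", "_value": [32, 64, 128, 256]},
    "d_v": {"_type": "choice", "_value": [32, 64, 128, 256]},
    "dropout": {"_type": "choice", "_value": [0, 0.1, 0.2, 0.3, 0.4, 0.5]},
    "attn_dropout": {"_type": "choice", "_value": [0, 0.1, 0.2, 0.3, 0.4, 0.5]},
    "lr": {"_type": "loguniform", "_value": [0.00005, 0.01]}
}
""")


def count_discrete_combinations(space):
    """Multiply the sizes of all "choice" entries (continuous entries are skipped)."""
    total = 1
    for spec in space.values():
        if spec["_type"] == "choice":
            total *= len(spec["_value"])
    return total


def sample_config(space, rng):
    """Draw one configuration the way a random search would (illustrative only)."""
    config = {}
    for name, spec in space.items():
        if spec["_type"] == "choice":
            config[name] = rng.choice(spec["_value"])
        elif spec["_type"] == "loguniform":
            low, high = spec["_value"]
            # Uniform in log space, so small learning rates are sampled as often as large ones.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            raise ValueError(f"unsupported type: {spec['_type']}")
    return config


if __name__ == "__main__":
    print(count_discrete_combinations(SEARCH_SPACE))  # 34560
    print(sample_config(SEARCH_SPACE, random.Random(0)))
```

With 34560 discrete combinations plus a continuous learning rate, a Random tuner with a limited trial budget covers only a small part of the space, so shrinking the lists for less important hyperparameters may be worth considering.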

• Experiments configuration:

The experiment configuration file is located at _time_series_imputation_survey_code/PyPOTS_tuning_configs/SAITS/SAITS_searching_config.yml. In this file, you configure the experiment information (name, tuner, trial command) and the devices used by the trials. You can also refer to the NNI documentation to modify it according to your needs.

experimentName: SAITS hyper-param searching
authorName: WenjieDu
trialConcurrency: 1
trainingServicePlatform: local
searchSpacePath: SAITS_AQI_tuning_space.json
multiThread: true
useAnnotation: false
tuner:
    builtinTunerName: Random

trial:
    command: set enable_tuning=1 && pypots-cli tuning --model pypots.imputation.SAITS --train_set ..\..\data\air_quality\train.h5 --val_set ..\..\data\air_quality\val.h5
    codeDir: .
    gpuNum: 1

localConfig:
    useActiveGpu: true
    maxTrialNumPerGpu: 1
    gpuIndices: 0

Note that on Windows, the trial command must start with "set enable_tuning=1 &&" so that the enable_tuning environment variable is set before pypots-cli runs.
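
The difference between operating systems can be sketched as a small Python helper that builds the trial command string. This is only an illustration: `build_trial_command` is a hypothetical name, the Windows form is the one used in the config above, and the non-Windows form assumes a POSIX shell, where a `VAR=value` prefix sets the variable for that single command.

```python
import platform


def build_trial_command(train_set, val_set, system=None):
    """Return the NNI trial command with enable_tuning set for the given OS."""
    system = system or platform.system()
    base = (
        "pypots-cli tuning --model pypots.imputation.SAITS "
        f"--train_set {train_set} --val_set {val_set}"
    )
    if system == "Windows":
        # cmd.exe has no inline VAR=value prefix, so the variable is set first.
        return f"set enable_tuning=1 && {base}"
    # POSIX shells accept a VAR=value prefix that applies to this one command.
    return f"enable_tuning=1 {base}"


if __name__ == "__main__":
    print(build_trial_command(r"..\..\data\air_quality\train.h5",
                              r"..\..\data\air_quality\val.h5",
                              system="Windows"))
    print(build_trial_command("../../data/air_quality/train.h5",
                              "../../data/air_quality/val.h5",
                              system="Linux"))
```

The resulting string is what goes into the `command` field under `trial` in the .yml file.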

Step 3. Run experiments (launch and monitoring)

In the terminal, change to the directory that contains the .yml file and launch the experiment with the NNI command-line tool (nnictl).

cd SAITS && nnictl create -c SAITS_searching_config.yml --port 8080

[Screenshot: terminal output after launching the experiment]

Then open the URL printed in the terminal to monitor the trials in the Web UI.

[Screenshots: Web UI 1, Web UI 2, Web UI 3]

Some tips

1. How to view the status of trials and log files?

You can click on a trial in the Web UI to monitor its status. If the status is "Failed", read the error message in the trial's log file to debug.

[Screenshot: viewing the log file]

2. How to free an occupied port on Windows?

If you have stopped an experiment with the 'nnictl stop' command and then start another experiment on the same port, you may encounter the error message 'Port 8080 is not idle.' This means a process is still holding the port, and you need to kill that process from CMD. First, find the PID of the process that uses the port (the PID is the last column of the output):

netstat -aon|findstr "8080"

Then kill that process, replacing <PID> with the number found above:

taskkill /T /F /PID <PID>

Detailed information can be found at this link
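
Before creating a new experiment, you can also check from Python whether the port is still occupied. The sketch below only tests whether something accepts TCP connections on the port. `is_port_idle` is a hypothetical helper, and NNI's own idle check may work differently.

```python
import socket


def is_port_idle(port, host="127.0.0.1"):
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    port = 8080
    state = "idle" if is_port_idle(port) else "in use"
    print(f"Port {port} is {state}")
```

If the port is reported as in use, either kill the owning process as described above or pass a different value to --port when creating the experiment.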

3. How to stop and resume an experiment?

You can use the 'nnictl stop' command to stop an experiment, and its status will change to 'STOPPED' in the Web UI. After an experiment has been stopped, its trial information remains accessible through the Web UI only if its status is 'DONE'. To continue an experiment whose status is 'STOPPED', use the 'nnictl resume' command with the experiment ID. Detailed information can be found at this link

We can exchange more experiences and questions to learn from each other!

WenjieDu / PyPOTS

Discussion on model hyperparameter optimization (HPO) with PyPOTS and NNI #408
