Why

To figure out moderate hyperparameters and get some hints for future experiments.

What

This is the result of densenet121 hyperparameter tuning with our advanced options.

How

Things I considered in this experiment:
- Using advanced options: ASL and random augmentation. It was not possible to use the label smoothing technique because this experiment started before the conflict between ASL and label smoothing was fixed.
- As many trials as possible: I ran 30 trials. This cannot guarantee that the tuning is optimal, but even so the experiment took almost a week, and that was with only half of the CheXpert dataset.
- Enough epochs: empirically, I observed the best score still improving after 15 epochs, so I trained for 20 epochs; training longer was hard due to the time limit.
- ImageNet-pretrained weights.
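For reference, the search space described above could look roughly like the sketch below. This is purely illustrative: the parameter ranges are guesses for exposition (not the exact ones used here), Optuna stands in for whichever tuner actually ran these 30 trials, and train_and_eval is a stub for the real training loop.

```python
# Hypothetical sketch of the tuned search space (illustrative ranges only).
import optuna


def train_and_eval(params: dict) -> float:
    # Stub: the real training/validation loop (densenet121, 20 epochs,
    # half of CheXpert) would go here and return the best val score.
    return 0.0


def objective(trial: optuna.Trial) -> float:
    params = {
        # optimizer hyperparameters
        "lr": trial.suggest_float("lr", 1e-5, 1e-1, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True),
        "batch_size": trial.suggest_categorical("batch_size", [16, 32, 64]),
        # ASL factors
        "asl_gamma_neg": trial.suggest_float("asl_gamma_neg", 0.0, 6.0),
        "ps_factor": trial.suggest_float("ps_factor", 0.0, 0.3),
        # RandAugment
        "ra_num_ops": trial.suggest_int("ra_num_ops", 1, 4),
        "ra_magnitude": trial.suggest_int("ra_magnitude", 1, 15),
    }
    return train_and_eval(params)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
```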
Experiment result Link: https://wandb.ai/snuh_interns/kdg_tune_densenet121/groups/trainval_2023-01-27_08-17-57/workspace?workspace=user-snuh_interns

[Figure: the top 9 trials highlighted in a parallel coordinates plot]
[Figure: the bottom 10 trials highlighted in a parallel coordinates plot]

Key observations

- The performance is not sensitive to lr unless lr is too high. Empirically, anything above 1e-2 could be too high, while values below 1e-3 seem adequate.
- weight_decay and batch_size do not look like good things to tune. Among the top 9 trials, weight_decay spans too wide a range to be informative (be cautious: the weight_decay axis is on a log scale), and batch_size shows no distinct pattern either.
- The results for the ASL factors are in line with intuition (see the ASL sketch after this list):
  - The gamma_neg values of the top 9 trials range from about 2.2 to 4.6, which looks neither too low nor too high. A fixed gamma_neg of 3-3.5 would probably be adequate; going above 4 seems unwise, since some low-ranked trials have gamma_neg between 4 and 4.5.
  - The ps_factor (probability shifting factor) values of the top 9 trials range from about 0.05 to 0.18. It is reassuring that the top 9 stay below 0.2, which would be suspiciously high. However, many low-ranked trials also sit around 0.12, so the signal here is less clear than for asl_gamma_neg. If there are not enough resources to tune, removing ps_factor from the search space might be better.
- It is hard to find a good combination of RandAugment's magnitude and number of operations, but at least avoiding too-strong augmentation seems wise (see the RandAugment sketch below).
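To make the two ASL knobs above concrete, here is a minimal sketch of the asymmetric loss for multi-label classification, following the public reference implementation by Ben-Baruch et al. (where ps_factor corresponds to the `clip` parameter). This mirrors the general technique, not this project's exact code, and the defaults in the signature only echo the ranges observed above.

```python
# Minimal ASL sketch showing where gamma_neg and the probability
# shifting factor (ps_factor) act. Illustrative, not the project's code.
import torch


def asymmetric_loss(logits: torch.Tensor,
                    targets: torch.Tensor,
                    gamma_pos: float = 0.0,
                    gamma_neg: float = 3.0,   # top-9 trials here: ~2.2-4.6
                    ps_factor: float = 0.1,   # top-9 trials here: ~0.05-0.18
                    eps: float = 1e-8) -> torch.Tensor:
    xs_pos = torch.sigmoid(logits)
    xs_neg = 1.0 - xs_pos

    # Probability shifting: raise negative probabilities by ps_factor so
    # very easy negatives contribute (near) zero loss.
    xs_neg = (xs_neg + ps_factor).clamp(max=1.0)

    los_pos = targets * torch.log(xs_pos.clamp(min=eps))
    los_neg = (1 - targets) * torch.log(xs_neg.clamp(min=eps))

    # Asymmetric focusing: down-weight easy examples, more aggressively
    # for negatives (gamma_neg > gamma_pos).
    pt = xs_pos * targets + xs_neg * (1 - targets)
    focus = (1 - pt) ** (gamma_pos * targets + gamma_neg * (1 - targets))

    return -(focus * (los_pos + los_neg)).sum()
```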
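Similarly, the two RandAugment knobs map directly onto torchvision's transforms.RandAugment. A minimal sketch, assuming the torchvision implementation is the one in use (the project's actual augmentation pipeline is not shown in this report):

```python
# Minimal sketch wiring RandAugment's two tuned knobs via torchvision.
from torchvision import transforms


def build_train_transform(num_ops: int = 2, magnitude: int = 9):
    # magnitude is an integer bin in [0, num_magnitude_bins - 1]
    # (31 bins by default); per the observation above, avoid values
    # near the top of that range.
    return transforms.Compose([
        transforms.RandAugment(num_ops=num_ops, magnitude=magnitude),
        transforms.ToTensor(),
    ])
```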