Closed gaoxueyi0532 closed 9 months ago
This is a good question. For the theorem, we require epsilon to be in [0, 1). The value difference comes from a slight discrepancy in the implementation described in the 2020 HSPG arXiv paper. Please use the up-to-date version, i.e., this repo. Meanwhile, we typically use epsilon = 0.9 or 0.95 for all our experiments.
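To make the constraint concrete, here is a minimal sanity check for the epsilon value (the function name is illustrative, not part of the OTO API):

def check_epsilon(eps):
    # The theorem requires epsilon in [0, 1); in practice 0.9 or 0.95 works well.
    if not (0.0 <= eps < 1.0):
        raise ValueError(f"epsilon must be in [0, 1), got {eps}")
    return eps

check_epsilon(0.95)  # valid: within [0, 1)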
I suspect you raised this question because, when applying DHSPG to your model, the group sparsity did not turn out as expected. If so, please keep the following in mind. In hyperparameters, we provide default settings for group sparsity exploration for the various optimizers. If group sparsity is not produced well, please increase lmbda (default on the order of 1e-2), lmbda_amplify, and hat_lmbda_coeff to 10 times larger. All such hyperparameters can be set when constructing the optimizer:
optimizer = oto.dhspg(
    ...,
    lmbda=...,
    lmbda_amplify=...,
    hat_lmbda_coeff=...,
)
Hope the above helps.
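As a concrete sketch of the 10x adjustment above (the default values here are placeholders, not the repo's documented defaults — check hyperparameters for the real per-optimizer values):

# Placeholder defaults for illustration only.
DEFAULTS = {"lmbda": 1e-3, "lmbda_amplify": 2.0, "hat_lmbda_coeff": 10.0}

def amplify(defaults, factor=10.0):
    # Scale the sparsity-driving hyperparameters by `factor` (here 10x).
    return {name: value * factor for name, value in defaults.items()}

stronger = amplify(DEFAULTS)
# These keyword arguments would then be passed to the constructor, e.g.
#   optimizer = oto.dhspg(variant='sgd', epsilon=0.95, **stronger, ...)
print(stronger)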
In the tutorials, epsilon is set to 0.95, but according to the paper's experiments and theory analysis it is recommended to be in the range [0.0, 0.05], which is confusing!
optimizer = oto.dhspg(
    variant='sgd',
    lr=0.1,
    target_group_sparsity=0.7,
    weight_decay=1e-4,
    start_pruning_steps=50 * len(trainloader),  # start pruning after 50 epochs
    epsilon=0.95)
Below is my code:
opt = oto.dhspg(
    variant='sgd',
    lr=0.01,
    target_group_sparsity=0.3,
    weight_decay=1e-4,
    start_pruning_steps=100 * len(train_loader),  # start pruning after 100 epochs
    epsilon=0.02)
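For reference, start_pruning_steps is counted in optimizer steps, so the code above converts epochs into steps by multiplying by the number of batches per epoch; a minimal sketch (the batch count is an assumed example, not from this thread):

batches_per_epoch = 391          # assumed: e.g., 50,000 samples / batch size 128
epochs_before_pruning = 100      # matches start_pruning_steps=100 * len(train_loader)
start_pruning_steps = epochs_before_pruning * batches_per_epoch
print(start_pruning_steps)  # 39100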
Which one is reasonable? Or are both reasonable?