Closed kumasento closed 5 years ago
Hi, this actually depends on the dataset and model configuration. However, our experiments show that 1e-5
works fairly well for many configurations.
Thanks!
Hi, when I apply group lasso to CondenseNet-light-94 on CIFAR-100, the loss suddenly increases during training. When I checked the weights, I found that some values had turned into NaN, but I don't know exactly why this happened. Is something wrong with my code, are the hyperparameters inappropriate, or does group lasso make training unstable?
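One possible cause, offered as an assumption rather than something confirmed from this repository's code: the group lasso term is a sum of per-group L2 norms, and the gradient of an unsmoothed norm is undefined when every weight in a group is exactly zero (for example after pruning). Automatic differentiation can then produce NaN. A minimal sketch in plain Python of a smoothed penalty follows; the function names and the `eps` parameter are hypothetical and only illustrate the idea, and `lam` stands in for `group_lasso_lambda`.

```python
import math

def group_lasso_penalty(groups, lam=1e-5, eps=1e-8):
    """Return lam * sum over groups of sqrt(sum(w^2) + eps)."""
    return lam * sum(math.sqrt(sum(w * w for w in g) + eps) for g in groups)

def group_lasso_grad(groups, lam=1e-5, eps=1e-8):
    """Gradient of the penalty above: lam * w / sqrt(sum(w^2) + eps) per weight."""
    grads = []
    for g in groups:
        # eps keeps the denominator strictly positive, even for an all-zero group
        norm = math.sqrt(sum(w * w for w in g) + eps)
        grads.append([lam * w / norm for w in g])
    return grads

# Hypothetical weights: the second group is entirely zero (e.g. pruned)
groups = [[3.0, 4.0], [0.0, 0.0]]
penalty = group_lasso_penalty(groups)
grads = group_lasso_grad(groups)
```

With `eps` set to zero, the all-zero group would make the denominator zero, which is the situation that can turn weights into NaN. If the loss still diverges with a smoothed norm, a smaller learning rate or a smaller `lam` may be worth trying.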
@ShichenLiu Hi, any update on this question? The paper mentions 1e-5 for the ImageNet dataset but does not give a value for the CIFAR datasets. I compared the following models:
# cifar100
# condensenet-86-no_grouplasso: 23.44
# condensenet-86-grouplasso: 24.14
The one without group lasso seems to perform better than the one with group lasso. Did you add the group lasso term for the CIFAR datasets in the paper?
Thanks in advance
Hi, this question might be trivial, but what is the exact value for group_lasso_lambda that you are using? Is it 1e-5, according to the paper? Thanks!