There are two ways to control the computation. One is to lower the number of epochs accordingly. It seems you are referring to the CIFAR-100 ResNet-18 experiment: with the default ratio of about 35%, using 154 epochs gives the corresponding result. If your implementation does not handle the scheduler's "total step" count with care, the annealing phase will be affected, which is probably what is hurting your result. The other way is to use a percentile-based threshold, which controls the pruning ratio directly.
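To make both options concrete, here is a minimal sketch (my own illustration, not the InfoBatch repo's actual code; all names and numbers below are assumptions):

```python
import numpy as np
import torch
from torch.optim.lr_scheduler import OneCycleLR

# (a) Percentile thresholding: fixes the pruning ratio directly by pruning
# the lowest-loss fraction of samples instead of thresholding at the mean.
losses = np.random.rand(50_000)              # stand-in per-sample loss record
prune_ratio = 0.35
threshold = np.percentile(losses, prune_ratio * 100)
keep_mask = losses >= threshold              # drop the lowest-loss 35%

# (b) Lowering the step budget: the scheduler's total_steps must match the
# optimizer steps actually taken under pruning, otherwise the final
# annealing phase of OneCycleLR is cut short.
batch_size, epochs = 128, 200                # assumed hyperparameters
steps_per_epoch_full = len(losses) // batch_size
total_steps = int(epochs * steps_per_epoch_full * (1 - prune_ratio))

model = torch.nn.Linear(8, 8)                # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = OneCycleLR(optimizer, max_lr=0.1, total_steps=total_steps)
```

The key point in (b) is that if `total_steps` is left at the full-data value, the learning rate never completes its decay before training ends.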
InfoBatch targets practical applications, so we did not study extreme ratios in depth. At an 80% pruning ratio it still gives reasonable performance. We suppose a few epochs of full-data annealing also make sense, although InfoBatch is currently not designed for extremely low keep ratios.
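As a rough sketch of the annealing idea (assumed structure for illustration, not the repo's implementation), the last few epochs simply switch back to the full dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, Subset

# Toy setup; every name and number here is an assumption for illustration.
data = TensorDataset(torch.randn(512, 8), torch.randint(0, 4, (512,)))
full_loader = DataLoader(data, batch_size=64, shuffle=True)

model = torch.nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

epochs = 20
annealing_epochs = 3  # assumed "several" full-data epochs at the end

def pruned_loader():
    # Stand-in for the actual pruning policy: subsample half the data.
    idx = torch.randperm(len(data))[: len(data) // 2].tolist()
    return DataLoader(Subset(data, idx), batch_size=64, shuffle=True)

for epoch in range(epochs):
    # In the last few epochs, disable pruning and anneal on the full set.
    anneal = epoch >= epochs - annealing_epochs
    loader = full_loader if anneal else pruned_loader()
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
```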
With the new hyperparameters in your updated paper, I successfully reproduced the 78.2% result. But I still have two questions. First, your code has no mechanism for controlling the pruning ratio; the default pruning rate is approximately 35%. When I cap the number of iterations at 39100 (half of full-dataset training), I fail to reproduce your result: top-1 accuracy is 75.30%. I have already changed the total steps of OneCycleLR. Did you make any other changes when adjusting the pruning ratio? Second, how do you design the training when the pruning ratio is extremely low? For example, at a 90% pruning ratio, how many epochs do you train on the full dataset at the end of training?
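For reference, the 39100 cap is consistent with halving the full-data step count, assuming batch size 128 and 200 epochs (my assumption; adjust to your actual settings):

```python
import math

steps_per_epoch = math.ceil(50_000 / 128)    # 391 iterations per epoch
full_training_steps = 200 * steps_per_epoch  # 78200 steps for full training
half_budget = full_training_steps // 2       # 39100, the cap used above
```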