Closed eldarkurtic closed 2 years ago
Hi,
Sure.
The interval [`begin_pruning_step`, `end_pruning_step`] defines when the scheduler allows the pruning masks to update and change the pruning pattern. Outside of this interval the masks remain constant and the pruning pattern stays the same regardless of the weights' magnitudes.
The interval [`policy_begin_step`, `policy_end_step`] defines when the pruning policy is applied. The pruning policy defines how we increase the sparsity during training from the initial sparsity to the final sparsity over the assigned interval. For example, in this library we strictly use the policy introduced in To prune, or not to prune: exploring the efficacy of pruning for model compression:

$$s_t = s_f + (s_i - s_f)\left(1 - \frac{t - t_0}{n\,\Delta t}\right)^3 \quad \text{for } t \in \{t_0,\ t_0 + \Delta t,\ \ldots,\ t_0 + n\Delta t\}$$

where $s_t$ is the sparsity at step $t$, $s_i$ is the initial sparsity, $s_f$ is the final sparsity, $t_0$ corresponds to `policy_begin_step`, $\Delta t$ to `pruning_frequency`, and $t_0 + n\Delta t$ to `policy_end_step`.
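To make the two intervals concrete, here is a minimal sketch of how they interact, assuming the cubic schedule above; this is just an illustration, not the package's actual API:

```python
def target_sparsity(step, initial_sparsity, final_sparsity,
                    policy_begin_step, policy_end_step):
    """Cubic ramp (Zhu & Gupta) from initial_sparsity to final_sparsity
    over [policy_begin_step, policy_end_step]; constant outside it."""
    if step <= policy_begin_step:
        return initial_sparsity
    if step >= policy_end_step:
        return final_sparsity
    progress = (step - policy_begin_step) / (policy_end_step - policy_begin_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


def masks_may_update(step, begin_pruning_step, end_pruning_step, pruning_frequency):
    """Masks are only recomputed inside [begin_pruning_step, end_pruning_step],
    once every pruning_frequency steps; outside this interval they stay frozen."""
    return (begin_pruning_step <= step <= end_pruning_step
            and (step - begin_pruning_step) % pruning_frequency == 0)
```

Note that the sparsity target saturates at `policy_end_step`, while mask updates can keep happening until `end_pruning_step`.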
If we do a run with 100k steps and we specify the following pruning config:
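For illustration, such a config could look roughly as follows (written here as a Python dict; in the repo it would live in a JSON config file like the `iterative_unstructured_magnitude_90` example). Only the 50k / 80k values come from this discussion; the remaining field names and numbers are placeholders:

```python
# Hypothetical pruning config for a 100k-step run (placeholder values except 50k / 80k).
pruning_config = {
    "begin_pruning_step": 0,      # assumed: masks may start updating from step 0
    "policy_begin_step": 0,       # assumed: the sparsity ramp starts at step 0
    "policy_end_step": 50_000,    # sparsity reaches its final value here
    "end_pruning_step": 80_000,   # masks stop updating after this step
    "pruning_frequency": 1_000,   # placeholder: recompute masks every 1k steps
    "initial_sparsity": 0.0,      # placeholder field name and value
    "target_sparsity": 0.9,       # placeholder field name and value
}
```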
we would get the following: the sparsity ramps up from its initial value and reaches the final value at step 50k. But I'm not sure what would happen with the model and its sparsity in the [50k, 80k] and [80k, 100k] ranges?
Since the pruning policy finished at step 50k and at that point our model has the final sparsity mask, why do we need `end_pruning_step` at 80k?
In the interval [50k, 80k] the sparsity ratio of the model has already reached its final value; however, the sparsity masks will continue to update every `pruning_frequency` steps, changing the sparsity pattern of the model according to the highest-magnitude weights currently in the model.
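To illustrate what that means in practice, here is a minimal sketch of an unstructured magnitude-pruning mask update at a fixed sparsity level; this is my own illustration of the behaviour described above, not the package's actual implementation:

```python
import torch


def recompute_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the highest-magnitude weights and zero out the `sparsity` fraction
    with the smallest magnitudes. The sparsity ratio stays the same, but which
    weights survive can change as the remaining weights keep training."""
    num_pruned = int(sparsity * weight.numel())
    if num_pruned == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(num_pruned).values
    return (weight.abs() > threshold).to(weight.dtype)


# In the [50k, 80k] interval of the example above, every pruning_frequency steps the
# scheduler would do something equivalent to:
#     mask = recompute_mask(weight, final_sparsity)
#     weight.data *= mask
# so the sparsity ratio stays at its final value while the pruning pattern can still
# drift toward the currently highest-magnitude weights. After end_pruning_step the
# existing mask is simply reused and no longer recomputed.
```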
Is this part described somewhere in the paper (just checking if I've missed it)? If not, could you please clarify a bit more how the sparsity mask changes in the [50k, 80k] range?
This is not described in the paper; however, it is common practice in magnitude pruning and I think it is described in To prune, or not to prune: exploring the efficacy of pruning for model compression, which we refer to in our paper.
Okay, thanks a lot for clarification :)
Hi, could you please clarify the difference between `end_pruning_step` and `policy_end_step` in the pruning config file (for example: https://github.com/IntelLabs/Model-Compression-Research-Package/blob/main/examples/transformers/language-modeling/config/iterative_unstructured_magnitude_90_config.json)?