tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Non-uniform sparsity layer-wise with PolynomialDecay scheduler #987

Open AmoghDabholkar opened 2 years ago

AmoghDabholkar commented 2 years ago

System information

Motivation

Instead of assuming that the initial and final sparsity are uniform across layers, would it be possible to add a feature where the user can feed in either a custom per-layer sparsity map or a sparsity distribution generated with ERK (as in @evcu's RigL codebase: https://github.com/google-research/rigl/blob/master/rigl/sparse_utils.py)? @evcu's experiments have shown ERK to work better, and the polynomial scheduler also tends to work well in general, so incorporating the two together in the tfmot call would be very helpful.

chococigar commented 2 years ago

Hi @AmoghDabholkar, thanks for your input! We haven't considered this feature yet in tfmot sparsity, but we will consider including it in our next batch of updates.

Thanks!

evcu commented 2 years ago

I agree it would be nice to support this. I've implemented ERK in a hacky way in one of our recent projects. The tricky part is that the layer parameter shapes are needed before the calculation, which often requires initializing the model and matching layer names to shapes. It would be much easier if an already-initialized model could be wrapped after the fact.

https://github.com/google-research/rigl/blob/0f029735f84e0120df05512244510e8ed48a4461/rigl/rl/dqn_agents.py#L163
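For reference, once the layer shapes are available, the ERK (Erdős–Rényi-Kernel) distribution from `sparse_utils.py` can be approximated in plain NumPy. The sketch below is a simplified reimplementation of that logic (including the loop that freezes layers whose implied density would exceed 1 as dense), not the exact rigl code; the function name and argument layout are my own.

```python
import numpy as np

def erk_sparsity_map(param_shapes, default_sparsity):
    """Approximate per-layer ERK sparsities (simplified sketch).

    param_shapes: dict of layer name -> weight shape tuple
    default_sparsity: target overall sparsity in [0, 1)
    Returns: dict of layer name -> sparsity for that layer,
    preserving the overall parameter budget.
    """
    dense_layers = set()  # layers forced fully dense
    while True:
        divisor, rhs = 0.0, 0.0
        raw_prob = {}
        for name, shape in param_shapes.items():
            n_param = float(np.prod(shape))
            n_zeros = n_param * default_sparsity
            if name in dense_layers:
                # Dense layer keeps all params; subtract its zero budget.
                rhs -= n_zeros
            else:
                rhs += n_param - n_zeros
                # ERK score: sum of dims over product of dims.
                raw_prob[name] = np.sum(shape) / np.prod(shape)
                divisor += raw_prob[name] * n_param
        eps = rhs / divisor
        max_prob = max(raw_prob.values())
        if max_prob * eps > 1:
            # Density would exceed 1: freeze those layers and retry.
            for name, p in raw_prob.items():
                if p == max_prob:
                    dense_layers.add(name)
        else:
            break
    return {
        name: 0.0 if name in dense_layers else 1.0 - eps * raw_prob[name]
        for name in param_shapes
    }
```

Under this scheme, layers with a high dimension-sum-to-parameter-count ratio (small or highly rectangular layers) end up less sparse, while large layers absorb most of the pruning, which matches the intent of ERK.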