EIDOSLAB / simplify

Simplification of pruned models for accelerated inference | SoftwareX
https://doi.org/10.1016/j.softx.2021.100907
BSD 3-Clause "New" or "Revised" License

Big accuracy drop after simplify #12

Open AndRayt opened 1 year ago

AndRayt commented 1 year ago

Hi guys, congrats on the work, nice library.

I am trying to use Simplify with ResNet on the CIFAR-100 dataset. I apply PyTorch's prune.ln_structured method with amount=0.8, then simplify, and then fine-tune the model. As a result the inference time improves by about 2x, but there is a big accuracy drop: 75% accuracy before prune.ln_structured + simplify + fine-tune and 65% after. Is this expected? Have you checked the accuracy before and after simplify in your experiments?
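
Roughly, my pipeline looks like this (a minimal sketch, not my exact notebook code; the simplify(model, dummy_input) call is assumed to follow the library's README):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import resnet50
from simplify import simplify  # entry point assumed to be simplify(model, dummy_input)

model = resnet50(num_classes=100)
# ... train on CIFAR-100 (loop omitted) ...

# Stage 1: zero 80% of the output channels of every conv layer (L2-norm ranking, dim=0).
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=0.8, n=2, dim=0)
        prune.remove(module, "weight")  # bake the zeros into .weight

# Stage 2: remove the zeroed structures so inference actually gets faster.
model.eval()
model = simplify(model, torch.zeros(1, 3, 32, 32))  # CIFAR-sized dummy input; call assumed per README

# Stage 3: fine-tune the simplified model (loop omitted).
```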

AndreaBrg commented 1 year ago

@AndRayt Hi, thanks for your interest. Could you provide sample code to reproduce the issue?

AndRayt commented 1 year ago

@AndreaBrg Hi, thank you for your answer. I attached an ipynb file with the code. There I train a ResNet-50 model from scratch for 15 epochs, reaching 36% accuracy (kept short to make it easier to reproduce; increasing the number of epochs improves the accuracy up to 98%). Then I apply prune.ln_structured with amount=0.8 and simplify the model. As a result I get a good inference speedup, but the accuracy drops to only 1%. I then fine-tune the model for 8 epochs (half the number of epochs spent on the original training) and reach only 25.7% accuracy. Clearly, in this case we could keep training until we reach 36% again, as after the 15 epochs of training from scratch. But if we take, for example, ResNet-18 on CIFAR-100, we reach 75% accuracy after 300 epochs, and fine-tuning for even the same 300 epochs does not bring it back to 75% (or even 72%); instead the accuracy ends up about 10% lower. Please help me understand this issue. I hope I have described the accuracy-drop problem in sufficient detail.

https://colab.research.google.com/drive/1uuUxNEuNv9yG46eVQdviSY3ud5ETtY7r?usp=sharing
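
For completeness, the fine-tuning stage in the notebook is a standard loop, roughly like this (a sketch; train_loader and the hyperparameters here are placeholders, not the exact notebook values):

```python
import torch
import torch.nn as nn
import torch.optim as optim

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)  # the simplified model from the previous step
model.train()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

for epoch in range(8):  # half of the original 15 training epochs, as described above
    for images, labels in train_loader:  # train_loader: assumed CIFAR-100 DataLoader
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```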

carloalbertobarbano commented 1 year ago

Hi! Did you compare the accuracy of the pruned model before and after applying simplify?
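
Something along these lines would isolate the two effects (a sketch: evaluate, pruned_model, and test_loader are placeholders, and the simplify(model, dummy_input) call is assumed as in the README):

```python
import copy
import torch
from simplify import simplify  # entry point assumed to be simplify(model, dummy_input)

@torch.no_grad()
def evaluate(model, loader, device="cuda"):
    # Placeholder helper: top-1 accuracy (%) on a dataloader.
    model.eval().to(device)
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total

acc_pruned = evaluate(pruned_model, test_loader)          # after ln_structured (+ prune.remove)
simplified_model = simplify(copy.deepcopy(pruned_model), torch.zeros(1, 3, 32, 32))
acc_simplified = evaluate(simplified_model, test_loader)  # should match acc_pruned if simplify is lossless
print(f"pruned: {acc_pruned:.2f}%  simplified: {acc_simplified:.2f}%")
```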

AndRayt commented 1 year ago

Hi! Do you mean that the problem lies with the torch.prune.ln_structured method, i.e. with how the model weights are zeroed? As I understand it, structured pruning is performed in two stages: 1) zeroing the weights of the model, 2) trimming the previously zeroed weights. Therefore, in my opinion, the accuracy of the model should be evaluated, just like the inference time, both before zeroing + trimming the weights and after. Or is the simplify library designed exclusively to speed up inference? Can you suggest a more effective way to zero the weights than prune.ln_structured before applying simplify in this case? An accuracy drop of more than 2-3% is an unsatisfactory result.
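
To make the two stages concrete, here is a small sketch on a single layer:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 128, kernel_size=3)

# Stage 1: zeroing. ln_structured only masks whole output channels; shapes do not change.
prune.ln_structured(conv, name="weight", amount=0.8, n=2, dim=0)
print(hasattr(conv, "weight_mask"))  # True: the zeros live in a mask, weight_orig is kept around
prune.remove(conv, "weight")         # make the zeros permanent in conv.weight
print(conv.weight.shape)             # still [128, 64, 3, 3]: nothing has been trimmed yet

# Stage 2: trimming. Only removing the zeroed channels (what simplify does) shrinks the
# tensors, which is where the inference speedup comes from.
```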

carloalbertobarbano commented 1 year ago

Yes, the aim of Simplify is only to reduce the inference time of a pruned model. Simplify should not alter accuracy whatsoever, so it is weird if that happens. Could you check whether the accuracy drop is caused by torch.prune.ln_structured or by simplify? If the cause is ln_structured, you could try a more advanced pruning scheme.
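
For example, one common option (not specific to Simplify) is gradual pruning: several smaller structured steps with fine-tuning in between, instead of a single 80% cut. A rough sketch, where fine_tune and train_loader are placeholders for your own training code:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Several smaller structured pruning steps, fine-tuning between each one.
for step in range(4):
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.ln_structured(module, name="weight", amount=0.33, n=2, dim=0)
    fine_tune(model, train_loader, epochs=5)  # placeholder for your training loop

# Make the masks permanent only at the end, then apply simplify as usual.
for module in model.modules():
    if isinstance(module, nn.Conv2d) and prune.is_pruned(module):
        prune.remove(module, "weight")
```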

AndRayt commented 1 year ago

Hi! Yes, we have a significant accuracy drop after ln_structured because a large fraction (0.8) of the weights is zeroed, but we also have an accuracy drop after Simplify: once the zeroed weights are removed and only part of the model remains, the accuracy falls to almost zero. Do you know of any better way to zero the weights than ln_structured? Your library helped a lot with inference time, and I would now like to solve the weight-zeroing issue (what should I use instead of ln_structured?) and get a good final result. Unfortunately, the current accuracy drop is very large.

carloalbertobarbano commented 1 year ago

Sorry for the delay. It seems then that there might be a bug in the current Simplify implementation. Under no circumstances should the simplify procedure change the model output. We need to investigate this more deeply, thanks for the issue.
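
A quick way to check this on your side (a sketch; pruned_model is your model after ln_structured + prune.remove, and the simplify(model, dummy_input) call is assumed as in the README):

```python
import copy
import torch
from simplify import simplify  # entry point assumed to be simplify(model, dummy_input)

pruned_model.eval()
x = torch.randn(4, 3, 32, 32)  # CIFAR-100-sized batch
with torch.no_grad():
    out_before = pruned_model(x)

simplified_model = simplify(copy.deepcopy(pruned_model), torch.zeros(1, 3, 32, 32))
simplified_model.eval()
with torch.no_grad():
    out_after = simplified_model(x)

# If simplify is lossless, the outputs should match up to numerical tolerance.
print(torch.allclose(out_before, out_after, atol=1e-5))
```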

carloalbertobarbano commented 1 year ago

What versions of torch, torchvision, and simplify are you using?
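
For example (the distribution name "simplify" in the last lines is an assumption; adjust it if your installed package name differs):

```python
import torch
import torchvision

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)

# Read simplify's installed version from package metadata.
from importlib.metadata import version
print("simplify:", version("simplify"))
```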