facebookresearch / kill-the-bits

Code for: "And the bit goes down: Revisiting the quantization of neural networks"

About small datasets training #6

Closed wnma3mz closed 5 years ago

wnma3mz commented 5 years ago

This is a great project. I am a newcomer to vector quantization, and I have run into a problem while using this project.

Since I just want to test the methods in the paper, I am using resnet18 on the CIFAR10 dataset. After fine-tuning the model, I saved it as resnet18-cifar10.pth. Finally, I ran the command from the README.md:

python quantize.py --model resnet18 --block-size-cv 9 --block-size-pw 4 --n-centroids-cv 256 --n-centroids-pw 256 --n-centroids-fc 2048 --data-path ../cifar10

Although the program has not finished running yet, the log already tells me that the result is not good. resnet18-cifar10.pth reaches about 80% accuracy, but in the log the accuracy is very low, no more than 10%. For example:

Quantizing time: 3min, Top1 after quantization: 8.51

Here I changed --data-path to the path of my CIFAR10 data, changed the model loading in quantize.py to load resnet18-cifar10.pth, and changed the data loading in data/dataloader.py to CIFAR10.
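The dataloader change was roughly like the following sketch (the normalization statistics and transforms here are standard CIFAR10 values shown for illustration, not necessarily the exact ones I used):

    import torch
    from torchvision import datasets, transforms

    def load_cifar10_test(data_path, batch_size=128):
        # Standard CIFAR10 per-channel mean/std (assumed values)
        normalize = transforms.Normalize((0.4914, 0.4822, 0.4465),
                                         (0.2470, 0.2435, 0.2616))
        test_set = datasets.CIFAR10(data_path, train=False, download=True,
                                    transform=transforms.Compose([
                                        transforms.ToTensor(), normalize]))
        # Evaluation loader used to measure top-1 accuracy after quantization
        return torch.utils.data.DataLoader(test_set, batch_size=batch_size,
                                           shuffle=False, num_workers=4)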

In summary, I don't know which step the problem is in, or whether I overlooked some aspect of the paper. I look forward to your reply. Thank you very much for your work; I have benefited a lot from it.

P.S. I have uploaded the related files to my GitHub:

  1. Dataset
  2. Trained model
  3. CIFAR10 training code
  4. Script to verify resnet18-cifar10.pth
  5. quantize.py
pierrestock commented 5 years ago

Hey wnma3mz and thanks for your interest in the project!

Could you check the following two things:

Also, one other possibility would be to start from the quantized resnet18 on ImageNet and fine-tune the centroids of the classifier directly.

wnma3mz commented 5 years ago

Thank you for your reply!

As for the other possibility you suggested, I will try it later.

wnma3mz commented 5 years ago

Thank you very much for your reply, I have now found the bug in my code.

The default number of output classes for ResNet is 1000, but CIFAR10 has only 10 labels. I should retrain the network instead of using pretrained=True. After this modification, I also loaded the same trained model for the teacher in quantize.py.
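A minimal sketch of the fix, assuming torchvision's resnet18 (illustrative, not the repo's exact loading code):

    import torch
    import torchvision.models as models

    # Build a 10-class ResNet-18 from scratch; pretrained=True would load
    # ImageNet weights with a 1000-way classifier, which caused the mismatch.
    student = models.resnet18(pretrained=False, num_classes=10)
    student.load_state_dict(torch.load('resnet18-cifar10.pth'))

    # The teacher in quantize.py should be the same trained model as the
    # student, not the ImageNet-pretrained one.
    teacher = models.resnet18(pretrained=False, num_classes=10)
    teacher.load_state_dict(torch.load('resnet18-cifar10.pth'))
    teacher.eval()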

This is a very awesome project, thank you again.

quocnhat commented 5 years ago

Hi @wnma3mz, what is the score of your quantized model? Is it still good compared to the original model? I followed your suggestion, but the score I got is much smaller than before quantization (~10 compared to ~70).

wnma3mz commented 5 years ago

> Hi @wnma3mz, what is the score of your quantized model? Is it still good compared to the original model? I followed your suggestion, but the score I got is much smaller than before quantization (~10 compared to ~70).

Thanks for your attention. Yes, compared to the original model, the accuracy dropped by about 4%. How did you run your experiment? Can you provide some code to help us figure out the problem?

quocnhat commented 5 years ago

Thanks @wnma3mz for your reply. Any suggestions would help me a lot. I followed the same steps you mentioned above, i.e.:

  • Train and save the student model resnet18-cifar10.pth. It reaches ~70% accuracy after 100 epochs with pretrained=False, by running your fine-tuning_model.py.
  • Run quantize.py with the same config you mentioned. Here I see that the score after quantizing the first layer is still good, but it drops a lot after fine-tuning the centroids. I guess it is because of the fine-tuning parameters.

[Screenshot from 2019-08-22 16-39-27]

wnma3mz commented 5 years ago

I think your guess is reasonable. How do you load your teacher model? My approach is to load the same trained model for the teacher as for the student.
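For context, the centroid fine-tuning in the paper distills the uncompressed teacher into the quantized student, so a mismatched teacher (e.g. an ImageNet-pretrained one) will drag the student down. A minimal sketch of that kind of distillation loss (assumed names and temperature, not the repo's exact code):

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=1.0):
        # KL divergence between teacher and student soft predictions; the
        # gradients flow back into the centroids of the quantized student.
        log_p = F.log_softmax(student_logits / temperature, dim=1)
        q = F.softmax(teacher_logits / temperature, dim=1)
        return F.kl_div(log_p, q, reduction='batchmean') * temperature ** 2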


quocnhat commented 5 years ago

Thank you for your fast reply. I will try it.