microsoft / LQ-Nets

LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks

about tf version #15

Closed cool-ic closed 5 years ago

cool-ic commented 5 years ago

Hi! Thanks for your code. Could you provide the CUDA/cuDNN/TensorFlow/tensorpack versions you used for the latest code? I got OOM with AlexNet (batch size 8/4) on 1080Ti, which I think is caused by a wrong software version.

EowinYe commented 5 years ago

Our experiments were run with TF 1.3.0 and the old tensorpack. If you'd like to test with the latest TF and tensorpack, you can try our new branch support-latest-tf-tensorpack. BTW, LQ-Nets needs a bit more GPU memory during training, but it should be sufficient if you try AlexNet with 4 batches per GPU.
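
As a quick sanity check (a minimal sketch; it only prints the installed versions so you can compare them against TF 1.3.0 and the old tensorpack mentioned above):

```python
# Print the installed TensorFlow and tensorpack versions so they can be
# compared against the versions the authors tested (TF 1.3.0, old tensorpack).
import tensorflow as tf
import tensorpack

print("TensorFlow:", tf.__version__)
print("tensorpack:", tensorpack.__version__)
```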

cool-ic commented 5 years ago

Thanks for your reply. Now I get OOM when I try to quantize both the weights and activations to 8 bits with AlexNet and 1 batch per GPU. It happens both on the support-latest-tf-tensorpack branch and the old version.

Does that mean quantizing both weights and activations to 8 bits needs more memory than a 1080Ti can handle? If not, what could be the reason for it?

EowinYe commented 5 years ago

Actually, we didn't quantize both weights and activations to 8 bits in our experiments. Since TensorFlow needs a lot of memory when executing matrix operators and a 1080Ti only has 11 GB, I'm not sure whether it can handle it. We did quantize them to 2 bits with AlexNet and 64 batches per GPU, so it should be fine if you try 8 bits with 1 batch per GPU. I recommend you try 2 bits first.
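
To get a feel for why the bit width matters for memory: roughly speaking, LQ-Nets expands each quantized tensor of N elements into a K-bit binary encoding, so the intermediate encoding tensor grows about linearly with K. A rough, illustrative sketch (the tensor shape and float32 storage are hypothetical assumptions, not taken from the actual implementation):

```python
import numpy as np

def encoding_overhead_bytes(num_elements, num_bits, dtype_bytes=4):
    """Rough size of a binary encoding tensor of shape [N, K] kept around
    during training, assuming float32 storage (an assumption for this sketch)."""
    return num_elements * num_bits * dtype_bytes

# Hypothetical activation tensor: batch 8, 256 channels, 13x13 spatial map.
n = 8 * 256 * 13 * 13
for k in (2, 4, 8):
    print(f"{k}-bit encoding: ~{encoding_overhead_bytes(n, k) / 2**20:.1f} MiB")
```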

cool-ic commented 5 years ago

Sorry to trouble you again. We fixed the OOM problem by using a better GPU, but now we are still unable to run LQ-Nets because of an "Input is not invertible" error in learned_quantization.py.

It seems that during training, B^T B becomes non-invertible. Do you have any idea how to fix it?

EowinYe commented 5 years ago

The input matrix can become non-invertible when the batch size is small and the quantization bit width is large. If you want to continue training in this condition, you can increase the EPS in learned_quantization.py, which smooths the training of the quantizer.
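
For context, the error comes from a least-squares solve of the form v = (B^T B)^{-1} B^T x for the quantizer basis, and EPS acts as a small ridge term added to B^T B before inversion. Below is a minimal NumPy sketch of that idea; the names B, x and the EPS value are illustrative and not the exact code in learned_quantization.py:

```python
import numpy as np

EPS = 1e-4  # illustrative value; in practice, raise the EPS constant in learned_quantization.py

def solve_quantizer_basis(B, x, eps=EPS):
    """Solve min_v ||B v - x||^2 for the quantizer basis v.
    When B^T B is singular (small batch, many bits), a plain inverse fails
    with "Input is not invertible"; adding eps * I keeps it well conditioned."""
    k = B.shape[1]
    btb = B.T @ B + eps * np.eye(k)  # regularized normal matrix
    btx = B.T @ x
    return np.linalg.solve(btb, btx)

# Tiny example: a rank-deficient encoding matrix that would break a plain inverse.
B = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [-1.0, -1.0]])  # columns are linearly dependent, so B^T B is singular
x = np.array([0.9, 1.1, -1.0])
print(solve_quantizer_basis(B, x))
```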