Training Time issue when training Condensenet-light on cifar100

ShichenLiu / CondenseNet

CondenseNet: Light weighted CNN for mobile devices

MIT License


Training Time issue when training Condensenet-light on cifar100 #21

Closed: infrontofme closed this issue 5 years ago

infrontofme commented 5 years ago

Hi, I am reproducing your work in TensorFlow, but I found that the dropping step during training takes a lot of time. Have you encountered this problem? What do you think the reason might be?

ShichenLiu commented 5 years ago

Hi,

In my implementation, I multiply the weight tensor by a binary mask, which should not significantly affect training speed. Could you describe how you implement the dropping operation? Thanks!
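Editor's note: the mask-multiply approach described in this reply can be sketched roughly as below. This is a simplified illustration in NumPy, not the repository's code. The function names `drop_smallest_columns` and `masked_forward`, the 2-D weight layout (output filters by input channels), and the L1-norm criterion for choosing what to drop are all assumptions made for the example.

```python
import numpy as np

def drop_smallest_columns(weight, mask, num_to_drop):
    """Return a new mask with `num_to_drop` more input columns zeroed.

    Hypothetical criterion: drop the still-active columns whose
    L1 norm (over all output filters) is smallest.
    """
    # L1 norm of each input column, counting only weights that are still active.
    importance = np.abs(weight * mask).sum(axis=0)
    # Columns that were already dropped get +inf so they are not picked again.
    active = mask.any(axis=0)
    importance = np.where(active, importance, np.inf)
    drop_idx = np.argsort(importance)[:num_to_drop]
    new_mask = mask.copy()
    new_mask[:, drop_idx] = 0.0
    return new_mask

def masked_forward(x, weight, mask):
    # The mask is applied with a single elementwise multiply,
    # so its cost is small compared with the matrix product.
    return x @ (weight * mask).T

weight = np.array([[1.0, 0.1, 2.0],
                   [1.0, 0.2, 3.0]])
mask = np.ones_like(weight)
mask = drop_smallest_columns(weight, mask, num_to_drop=1)
out = masked_forward(np.array([[1.0, 1.0, 1.0]]), weight, mask)
```

The mask only changes at the points where channels are dropped; every ordinary training step just reuses it in the one multiply inside `masked_forward`.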

infrontofme commented 5 years ago

Thank you for your reply.

I also multiply the weight tensor by a binary mask. I found that the computation graph works differently in TensorFlow and PyTorch: in TensorFlow the graph is static. When I used a loop in my code, it made the computation graph larger and the computation slower. I then fixed my code, and the problem was solved.

Your work is awesome!
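Editor's note: the thread does not show the loop that caused the slowdown, so the following is only a hypothetical illustration of the pattern, written in NumPy. `apply_mask_loop` masks one output filter per iteration; in a static-graph framework, a Python loop like this would typically add a separate set of operations for every iteration. `apply_mask_vectorized` produces the same result with one elementwise multiply. Both function names and the tensor shapes are assumptions.

```python
import numpy as np

def apply_mask_loop(weight, mask):
    """Mask one output filter at a time (the slow pattern)."""
    rows = []
    for i in range(weight.shape[0]):
        # One small multiply per filter; a static graph would
        # record each of these as its own operation.
        rows.append(weight[i] * mask[i])
    return np.stack(rows)

def apply_mask_vectorized(weight, mask):
    """Mask the whole tensor with a single elementwise multiply."""
    return weight * mask

rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 32))
mask = (rng.random((64, 32)) > 0.5).astype(weight.dtype)

looped = apply_mask_loop(weight, mask)
vectorized = apply_mask_vectorized(weight, mask)
same = np.allclose(looped, vectorized)
```

Since both versions compute the same values, replacing the loop with the single multiply changes only how many operations the framework has to build and launch.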

ShichenLiu commented 5 years ago

Yes, avoid for loops during training in any deep learning framework. They cause CUDA kernels to be started and stopped frequently, which slows training considerably. Glad to hear that the problem has been fixed!