BichenWuUCB / squeezeDet

A tensorflow implementation for SqueezeDet, a convolutional neural network for object detection.
BSD 2-Clause "Simplified" License
739 stars 306 forks

low GPU usage #121

Open cygerts opened 5 years ago

cygerts commented 5 years ago

I noticed very low GPU usage when running the code. In nvidia-smi, utilization sits at 0% with sudden jumps to ~90%. When I run the training on an average CPU, throughput is around 3 images/second. On a powerful Tesla V100 it is only 3 times faster, so something can definitely be improved.
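To put numbers on the "0% with jumps to ~90%" pattern, utilization can be sampled at a fixed interval while training runs. This is a minimal sketch rather than part of squeezeDet. It assumes `nvidia-smi` is on the PATH and reports a plain integer for `utilization.gpu`; the helper names, the sampling interval, and the 10% "busy" threshold are all hypothetical choices.

```python
import subprocess
import time


def parse_utilization(text):
    """Parse nvidia-smi CSV output: one integer percentage per GPU, one per line."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]


def sample_gpu_utilization(samples=20, interval_s=0.5):
    """Poll nvidia-smi `samples` times and return one list of percentages per poll."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=utilization.gpu",
        "--format=csv,noheader,nounits",
    ]
    readings = []
    for _ in range(samples):
        out = subprocess.check_output(cmd, universal_newlines=True)
        readings.append(parse_utilization(out))
        time.sleep(interval_s)
    return readings


def summarize(readings, busy_threshold=10):
    """Return (mean utilization, fraction of samples above threshold) for GPU 0."""
    gpu0 = [r[0] for r in readings if r]
    if not gpu0:
        return 0.0, 0.0
    mean = sum(gpu0) / float(len(gpu0))
    busy = sum(1 for u in gpu0 if u > busy_threshold) / float(len(gpu0))
    return mean, busy
```

Running `summarize(sample_gpu_utilization())` in a second terminal during training gives a mean utilization and the share of samples in which the GPU was doing any work, which is easier to compare across settings than watching the nvidia-smi display.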

The default batch size is 20; I changed it to 128 for the Tesla, but that didn't improve the speed. Disabling data augmentation didn't help either. I am training on the Detrac dataset with img_size = 480x270.

Any suggestions as to why training speeds up so little on the GPU in this setting?
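One way to narrow this down is to measure how wall time splits between preparing a batch and running the training step. The following is a rough sketch for Python 3, not the squeezeDet training loop: `load_batch` and `train_step` are hypothetical placeholders for whatever reads and preprocesses a batch and whatever executes one optimizer step. It assumes the two run one after the other in the same thread, so it would not apply as-is if batches are produced asynchronously.

```python
import time


def time_split(load_batch, train_step, num_steps=50):
    """Return total seconds spent in load_batch and in train_step over num_steps."""
    load_s = 0.0
    step_s = 0.0
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = load_batch()      # hypothetical: read and preprocess one batch
        t1 = time.perf_counter()
        train_step(batch)         # hypothetical: run one training step
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
    return load_s, step_s
```

If `load_s` dominates, the GPU is probably idle while waiting for input, which would be consistent with utilization sitting near 0% between short bursts; if `step_s` dominates, the cause likely lies elsewhere.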