SimJeg / FC-DenseNet

Fully Convolutional DenseNets for semantic segmentation.

Comparing training time as sanity check #15

Closed lewfish closed 7 years ago

lewfish commented 7 years ago

I implemented this model in Keras/TensorFlow and am finding that training is slower than I expected: about 8 times slower per epoch than a simplified version of U-Net. This might be because FC-DenseNet has a greater number of sequential operations, which can't be parallelized on the GPU.

As a sanity check, I wanted to compare the time per epoch that you reported with mine. The README says that it took 120 seconds per epoch. How many samples are there per epoch? I know there are 3 samples per minibatch, so how many minibatches are there per epoch? This number seems to come from iter.get_n_batches() (https://github.com/SimJeg/FC-DenseNet/blob/master/train.py#L24), which isn't in the repo. I'm training on a Tesla K80 using 256x256 images with 4096 samples per epoch and a batch size of 2 (i.e. 2048 minibatches), and it's taking 45 minutes per epoch.

Thanks!

SimJeg commented 7 years ago

Hi, yes, the classical U-Net is faster to train :) We used a batch size of 3 with <400 images in the training set. So 2 min for <400 images on a Titan X with crops of (224,224) and 45 min for 4000 images on a K80 with crops of (256,256) sounds quite normal to me. Consider training on multiple GPUs if you can!
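
As a rough check on the numbers above, here is a minimal Python sketch that converts both reports to seconds per sample and scales for crop size. It assumes exactly 400 training images for the Titan X run (the thread only says fewer than 400), assumes cost grows linearly with pixel count, and ignores the different batch sizes (3 vs. 2). The function names are illustrative and do not come from the repo.

```python
import math


def minibatches_per_epoch(n_samples, batch_size):
    """Number of minibatches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


def seconds_per_sample(epoch_seconds, n_samples):
    """Average wall-clock time spent on one training sample."""
    return epoch_seconds / n_samples


# Titan X run: 120 s per epoch, "<400" images rounded up to 400 (assumption).
titan_x = seconds_per_sample(120, 400)

# K80 run: 45 min per epoch, 4096 samples, batch size 2.
k80 = seconds_per_sample(45 * 60, 4096)

# Assumption: per-sample cost is proportional to the number of pixels in the crop.
pixel_ratio = (256 * 256) / (224 * 224)
titan_x_at_256 = titan_x * pixel_ratio

# How much slower the K80 run is per sample after adjusting for crop size.
slowdown = k80 / titan_x_at_256

print(f"K80 minibatches per epoch: {minibatches_per_epoch(4096, 2)}")
print(f"Titan X: {titan_x:.3f} s/sample at 224x224, "
      f"{titan_x_at_256:.3f} s/sample scaled to 256x256")
print(f"K80: {k80:.3f} s/sample at 256x256")
print(f"K80 / Titan X (crop-adjusted): {slowdown:.2f}x")
```

Under those assumptions, the K80 run comes out roughly 1.7x slower per sample than the Titan X run once crop size is accounted for. Because the real training set had fewer than 400 images, the true gap is somewhat smaller.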

ldenoue commented 7 years ago

@lewfish do you have your TensorFlow port published somewhere? I'm interested in testing the speed on my images (dancers, not roads). Thanks.

lewfish commented 7 years ago

Here is my port to Keras/TensorFlow: https://github.com/azavea/raster-vision/blob/develop/src/rastervision/semseg/models/fc_densenet.py