Closed roboserg closed 6 years ago
Hi @roboserg, thanks for submitting the issue. @sandeep-krishnamurthy, requesting that this be labeled.
Hi @roboserg
Thanks for testing it out. Yes, you are right: on a smaller dataset with only 1 GPU, the MXNet backend is slightly slower than, or similar in performance to, the TF backend. This is consistent with the benchmark results - https://github.com/awslabs/keras-apache-mxnet/tree/master/benchmark#cnn-benchmarks
I ran a small experiment with the MXNet and TF backends for MNIST_CNN on a P2.X machine (1 NVIDIA K80 GPU). The MXNet backend takes around 9 seconds per epoch and the TF backend takes around 8 seconds per epoch (using the channels_first format for both backends). So the MXNet backend with 1 GPU should be slightly slower than, or similar to, TF, but not far off. Can you please confirm that you are using the GPU and that the image format is channels_first?
You can see a significant speedup with the MXNet backend on larger images and multiple GPUs, as shown in the same benchmarks - https://github.com/awslabs/keras-apache-mxnet/tree/master/benchmark#cnn-benchmarks
I updated / rewrote my first post and did 5 runs for cifar10 (mxnet and TF each).
@sandeep-krishnamurthy
I can confirm I am using a GPU, as I monitor the GPU load during training. For mxnet I additionally force the GPU with context=["gpu(0)"] in model.fit().
Image format is channels_first in both cases, mnist x_train shape: (60000, 1, 28, 28), cifar10 x_train shape: (50000, 3, 32, 32)
I will try bigger images with VGG net or Inception and will report the results.
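For anyone reproducing this, a quick way to sanity-check the data layout is to compare the batch shape against the channels_first convention (samples, channels, height, width). The snippet below is only a sketch working on shape tuples; the helper names `is_channels_first` and `to_channels_first` are made up for illustration and are not part of Keras or keras-mxnet.

```python
def is_channels_first(shape, channels):
    """True if a 4-D batch shape is (samples, channels, height, width)."""
    return len(shape) == 4 and shape[1] == channels


def to_channels_first(shape):
    """Map a (samples, height, width, channels) shape to (samples, channels, height, width).

    For real NumPy arrays the equivalent operation is a transpose with axes (0, 3, 1, 2).
    """
    n, h, w, c = shape
    return (n, c, h, w)


# Shapes quoted in the comment above.
print(is_channels_first((60000, 1, 28, 28), channels=1))  # MNIST
print(is_channels_first((50000, 3, 32, 32), channels=3))  # CIFAR-10

# A channels_last CIFAR-10 batch would need converting first.
print(to_channels_first((50000, 32, 32, 3)))
```

If the check fails for the arrays passed to model.fit(), the two backends are not being compared on the same layout.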
@roboserg - Thank you.
You can also use the benchmark utility we have to test a bigger ResNet network - https://github.com/awslabs/keras-apache-mxnet/tree/master/benchmark
Thanks for diving into the issue @roboserg
As mentioned by @sandeep-krishnamurthy, the performance results you see are consistent with the benchmarking reports.
Closing this issue for now. Feel free to re-open it to report more stats on the performance.
I did 5 runs with "channels_first" for both mxnet and "normal" Keras with TF on the MNIST and CIFAR examples from this repo. My results are the following:
MNIST:
MXNet: 1min 19s ± 1.17 s per loop (mean ± std. dev. of 5 runs, 1 loop each)
TF: 1min ± 1.84 s per loop (mean ± std. dev. of 5 runs, 1 loop each)
mxnet 24% slower
CIFAR-10:
MXNet: 47 s ± 643 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)
TF: 56.8 s ± 527 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)
mxnet 17% faster
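The "mean ± std. dev. of 5 runs, 1 loop each" figures above are in the format printed by IPython's %timeit. For reference, roughly the same measurement can be sketched in plain Python. `train_once` here is a hypothetical stand-in for building the model and calling model.fit(); it is not the code from this issue.

```python
import statistics
import time


def time_runs(fn, runs=5):
    """Call fn() `runs` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, population standard deviation) of the durations."""
    return statistics.mean(durations), statistics.pstdev(durations)


def train_once():
    # Hypothetical placeholder workload; replace with the real training run.
    sum(i * i for i in range(100_000))


durations = time_runs(train_once, runs=5)
mean_s, std_s = summarize(durations)
print(f"{mean_s:.3f} s ± {std_s:.3f} s per run (mean ± std. dev. of {len(durations)} runs)")
```

With a real training run, the first call may include one-time setup cost (graph building, cuDNN autotuning), so it can be worth looking at the individual durations as well as the mean.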
It is weird, since mxnet is supposed to be 50%+ faster than TF. My specs are: CPU i7 6700K, GPU GTX 1070, 16 GB RAM. Keras and mxnet 2.1.6. Windows 10 64-bit.
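For clarity on the percentages: "24% slower" and "17% faster" match the time saved by the faster backend relative to the slower one. Measured the other way (extra time relative to the faster backend), the gaps come out larger. A small sketch of both conventions, using the mean timings above (function names are illustrative only):

```python
def time_saved_pct(slower_s, faster_s):
    """Percent of the slower run's wall time that the faster run saves."""
    return 100.0 * (slower_s - faster_s) / slower_s


def extra_time_pct(slower_s, faster_s):
    """Percent extra wall time the slower run needs, relative to the faster run."""
    return 100.0 * (slower_s - faster_s) / faster_s


# Mean wall times in seconds, taken from the results above.
timings = {
    "MNIST": {"mxnet": 79.0, "tf": 60.0},
    "CIFAR-10": {"mxnet": 47.0, "tf": 56.8},
}

for dataset, t in timings.items():
    fast = min(t, key=t.get)
    slow = max(t, key=t.get)
    saved = time_saved_pct(t[slow], t[fast])
    extra = extra_time_pct(t[slow], t[fast])
    print(f"{dataset}: {fast} saves {saved:.1f}% of {slow}'s time; "
          f"{slow} needs {extra:.1f}% more time than {fast}")
```

Either convention is fine as long as it is stated, but a "50%+ faster" claim is hard to check against these numbers without knowing which one it uses.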
Why is mxnet slower on MNIST?
Code: