tensorflow / benchmarks

A benchmark framework for TensorFlow

Keras+TensorFlow Benchmark on Synthetic LSTM Dataset #157

Open karan6181 opened 6 years ago

karan6181 commented 6 years ago

Hi,

I am running the lstm_benchmark.py test on CPU and on a multi-GPU device (Amazon EC2), and I am not getting the scaling I expected. The details are below:

Instance: p3.8xlarge (Amazon AWS), which has 4 GPUs

Virtual Env: TensorFlow (+Keras 2) with Python 2 (CUDA 9.0, V9.0.176), activated via source activate tensorflow_p27

Python version: 2.7.14

Tensorflow version: 1.5.0

Keras version: 2.1.4

Deep Learning AMI: Amazon Linux

Modifications:

run_tf_backend.sh: changed models='resnet50_eager' to models='lstm'

models/lstm_benchmark.py: changed self.num_samples = 1000 to self.num_samples = 50000
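For reference, the second change amounts to the following (a sketch only; the actual class and attribute layout in models/lstm_benchmark.py may differ slightly):

```python
# models/lstm_benchmark.py (sketch; the class name here is illustrative)
class LstmBenchmark:
    def __init__(self):
        # was: self.num_samples = 1000
        # raised so each epoch runs long enough to time reliably
        self.num_samples = 50000
```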

Command ran:

$ sh run_tf_backend.sh cpu_config
$ sh run_tf_backend.sh gpu_config
$ sh run_tf_backend.sh multi_gpu_config

Results:

| Instance | GPUs | Backend | Batch size | Data set | Training method | Speed/epoch (lower is better) | Unroll type | No. of samples | Memory (MiB) |
|---|---|---|---|---|---|---|---|---|---|
| p3.8xlarge | 0 | TensorFlow | 128 | Synthetic | fit() | 18 s (363 µs/step) | unroll=False | 50000 | 0 |
| p3.8xlarge | 1 | TensorFlow | 128 | Synthetic | fit() | 18 s (362 µs/step) | unroll=False | 50000 | 15360 |
| p3.8xlarge | 4 | TensorFlow | 128 | Synthetic | fit() | 33 s (651 µs/step) | unroll=False | 50000 | 15410 |

The test does not scale when using multiple GPUs: the time per epoch should drop roughly by a factor of n, where n is the number of GPUs. (Keras is reporting per-sample time here: 50000 samples × 363 µs ≈ 18 s.) Instead, the 4-GPU run is slower (33 s/epoch) than both the single-GPU and CPU runs (18 s/epoch).
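For context, my understanding is that the multi_gpu_config path does Keras-style data parallelism, along the lines of keras.utils.multi_gpu_model (available in Keras 2.1.4). Below is a minimal sketch of the setup I expected to scale; the shapes and layer sizes are illustrative, not the benchmark's exact ones:

```python
# Not the repo's script: a minimal sketch of Keras data parallelism via
# keras.utils.multi_gpu_model. Shapes and layer sizes are illustrative only.
import numpy as np
from keras.layers import LSTM, Dense, Embedding
from keras.models import Sequential
from keras.utils import multi_gpu_model

num_samples, maxlen, vocab = 50000, 80, 20000
x = np.random.randint(1, vocab, size=(num_samples, maxlen))
y = np.random.randint(0, 2, size=(num_samples, 1))

model = Sequential([
    Embedding(vocab, 128, input_length=maxlen),
    LSTM(128, unroll=False),
    Dense(1, activation='sigmoid'),
])

# Replicates the model on 4 GPUs: each batch of 128 is split into four
# sub-batches of 32, run in parallel, and the results are merged on the CPU.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='binary_crossentropy', optimizer='adam')

# Ideally this epoch takes ~1/4 of the single-GPU time.
parallel_model.fit(x, y, batch_size=128, epochs=1)
```

With a global batch of 128, each GPU only sees 32 samples per step, so per-step launch and CPU-side merge overhead may be eating the speedup; that would be consistent with the 4-GPU run coming out slower rather than faster.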

Is this expected behavior, or am I missing something here?

Thank you!

reedwm commented 6 years ago

/CC @anj-s