a-agrz opened this issue 6 years ago

Hello!

I used the GpuKMeans example to write a KMeans application over the MNIST dataset, which contains 60,000 images. For comparison, I ran the org.apache.spark.ml.clustering API over the same dataset.

These are the results I got for 20 iterations: CPU: 16.54 s, GPU: 41.1 s. That is a 2.48x slowdown!

So how can I make the KMeans application run faster on GPU?

PS: these results were obtained in a spark-shell running on 40 cores and an M60 GPU.

Best regards
aguerzaa

Josiah Samuel commented on Feb 21, 2018:

Are you using mini-batches during training? GPUEnabler works best when it can hold the entire dataset in GPU memory, with cacheGpu used between transformations to keep data movement between the GPU and CPU to a minimum. Data movement is quite costly and can easily outweigh the computational gain from running on the GPU.
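For readers unfamiliar with the API, the pattern Josiah describes looks roughly like the following. This is a minimal sketch modeled on the GpuEnablerExample from the project README; the kernel names, the PTX resource, and the CUDAFunction arguments are taken from that example and may differ in other versions:

```scala
import com.ibm.gpuenabler.CUDARDDImplicits._
import com.ibm.gpuenabler.CUDAFunction

// In spark-shell, `sc` is already defined. Assumes a PTX file containing
// "multiplyBy2" and "sum" kernels, as shipped with the project's examples.
val ptxURL = getClass.getResource("/GpuEnablerExamples.ptx")
val mapFunction = new CUDAFunction("multiplyBy2", Array("this"), Array("this"), ptxURL)
val reduceFunction = new CUDAFunction("sum", Array("this"), Array("this"), ptxURL)

// cacheGpu() keeps the RDD's data resident on the device between
// transformations, so the reduce does not pay for a GPU-CPU round trip.
val data = sc.parallelize(1 to 100000, 1).cacheGpu()
val result = data
  .mapExtFunc((x: Int) => 2 * x, mapFunction)
  .reduceExtFunc((x: Int, y: Int) => x + y, reduceFunction)
data.unCacheGpu() // release device memory when finished
```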
a-agrz replied:

No, all of the data is cached on the GPU up front, since the entire dataset fits in the 7 GB of GPU memory, and cacheGpu is used between transformations, just as in the GpuKMeans.scala example.
Best regards
aguerzaa
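For reference, the CPU baseline described in the original post can be reproduced with the standard org.apache.spark.ml.clustering API. A minimal sketch, assuming MNIST is available as a libsvm file; the path, k = 10, and the timing harness are illustrative assumptions, not the exact code behind the numbers above:

```scala
import org.apache.spark.ml.clustering.KMeans

// In spark-shell, `spark` is predefined. The path is a placeholder: any
// MNIST copy in libsvm format (60,000 rows, 784-dimensional "features"
// vectors) will do.
val mnist = spark.read.format("libsvm").load("mnist.scale")

val kmeans = new KMeans()
  .setK(10)       // one cluster per digit class
  .setMaxIter(20) // 20 iterations, matching the benchmark above

val t0 = System.nanoTime()
val model = kmeans.fit(mnist)
println(f"KMeans fit took ${(System.nanoTime() - t0) / 1e9}%.2f s")
```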