Oracen-zz / MIDAS

Multiple imputation utilising denoising autoencoder for approximate Bayesian inference
Apache License 2.0

GPU utilization in AWS #9

Closed MarKo9 closed 5 years ago

MarKo9 commented 6 years ago

Hi,

Once again, thanks for the effort. I'm running the previous version of the library on AWS (p2.8xlarge) against a ~250GB dataset, and all eight GPUs appear to be detected:

```
Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:17.0, compute capability: 3.7)
2018-10-17 06:30:29.402721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla K80, pci bus id: 0000:00:18.0, compute capability: 3.7)
2018-10-17 06:30:29.402734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla K80, pci bus id: 0000:00:19.0, compute capability: 3.7)
2018-10-17 06:30:29.402745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: Tesla K80, pci bus id: 0000:00:1a.0, compute capability: 3.7)
2018-10-17 06:30:29.402757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:4) -> (device: 4, name: Tesla K80, pci bus id: 0000:00:1b.0, compute capability: 3.7)
2018-10-17 06:30:29.402768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:5) -> (device: 5, name: Tesla K80, pci bus id: 0000:00:1c.0, compute capability: 3.7)
2018-10-17 06:30:29.402779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:6) -> (device: 6, name: Tesla K80, pci bus id: 0000:00:1d.0, compute capability: 3.7)
2018-10-17 06:30:29.402801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:7) -> (device: 7, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
```

However, when I check utilization, only one GPU is actually doing work:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:17.0 Off |                    0 |
| N/A   77C    P0    85W / 149W |  10931MiB / 11439MiB |     60%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           On   | 00000000:00:18.0 Off |                    0 |
| N/A   54C    P0    69W / 149W |  10877MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           On   | 00000000:00:19.0 Off |                    0 |
| N/A   78C    P0    60W / 149W |  10877MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           On   | 00000000:00:1A.0 Off |                    0 |
| N/A   57C    P0    70W / 149W |  10875MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla K80           On   | 00000000:00:1B.0 Off |                    0 |
| N/A   74C    P0    61W / 149W |  10875MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla K80           On   | 00000000:00:1C.0 Off |                    0 |
| N/A   56C    P0    70W / 149W |  10875MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla K80           On   | 00000000:00:1D.0 Off |                    0 |
| N/A   77C    P0    62W / 149W |  10873MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   59C    P0    70W / 149W |  10871MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

```
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2244      C   /home/ubuntu/src/anaconda3/bin/python      10912MiB |
|    1      2244      C   /home/ubuntu/src/anaconda3/bin/python      10858MiB |
|    2      2244      C   /home/ubuntu/src/anaconda3/bin/python      10858MiB |
|    3      2244      C   /home/ubuntu/src/anaconda3/bin/python      10856MiB |
|    4      2244      C   /home/ubuntu/src/anaconda3/bin/python      10856MiB |
|    5      2244      C   /home/ubuntu/src/anaconda3/bin/python      10856MiB |
|    6      2244      C   /home/ubuntu/src/anaconda3/bin/python      10854MiB |
|    7      2244      C   /home/ubuntu/src/anaconda3/bin/python      10854MiB |
+-----------------------------------------------------------------------------+
```

Is the library designed to utilize all GPUs available on the system by default?

Main library versions:

- tensorflow 1.4.0rc0
- numpy 1.13.3
- pandas 0.20.3 (py36h6022372_2)
- CUDA compilation tools, release 9.0, V9.0.176

Thanks in advance.
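The behaviour in the logs above matches TF 1.x defaults: TensorFlow reserves memory on every visible GPU at session creation but places ops on `/device:GPU:0` unless the graph is explicitly distributed, so memory shows up on all eight K80s while only one does compute. A generic CUDA-level mitigation (a sketch, not anything MIDAS-specific) is to hide the unused GPUs before TensorFlow initialises, so they at least stay free for other processes:

```python
import os

# Restrict the CUDA runtime to GPU 0. This must be set before TensorFlow
# (or any other CUDA-using library) is imported, because devices are
# enumerated once at initialisation time.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import tensorflow as tf  # import only after the variable is set
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Actually using all eight GPUs would require explicit data parallelism (in TF 1.x, building per-GPU tower replicas under `tf.device(...)` scopes), which the library itself would have to implement.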

cmlakhan commented 5 years ago

@MarKo9 Did you figure this out? I am having the same issue.

ranjitlall commented 5 years ago

Any thoughts about this? Did you get my previous message btw?

Ranjit

> From: MarKo9, Saturday, May 4, 2019, Re: [Oracen/MIDAS] GPU utilization in AWS (#9)
>
> My problem was that I was using a huge dataset (42m x 600) and it took forever to impute. However, the problem with the library, more than the GPU utilization (which, to answer your question, I did not solve), is that it does a lot of calculations on one core, not vectorized, and this is where the big speed problems come from.
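To illustrate the one-core-versus-vectorised point in general terms (a generic NumPy example, not code from MIDAS): a per-element Python loop pays interpreter overhead on every step and runs on a single core, while the equivalent array expression is dispatched to compiled NumPy kernels:

```python
import numpy as np

def loop_scale(a, c):
    # One element at a time: interpreter overhead on every iteration.
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * c + 1.0
    return out

def vec_scale(a, c):
    # The whole array is handled by one compiled, vectorised kernel.
    return a * c + 1.0

x = np.arange(5, dtype=float)
assert np.allclose(loop_scale(x, 2.0), vec_scale(x, 2.0))
```

On large arrays the vectorised form is typically one to two orders of magnitude faster, which is the kind of gap that dominates runtime on a 42m x 600 dataset.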

ranjitlall commented 5 years ago

I didn't get the previous message. Will look tomorrow evening, got two meetings with VC and CEO tomorrow.

Of course it's not using more than one core, that would require custom architecture-dependent code. He's throwing P3 nodes at code designed for consumer GPUs. The guy's a numpty and if he's so concerned he should code his own infrastructure.

MIDAS is still in limbo while I work on company code. We're close to seed funding, and as mentioned previously, until money is coming in again, all effort is directed at securing income.

Apologies,

Alex


ranjitlall commented 5 years ago

No worries, I was just asking how the business is doing and saying we should catch up soon.

Ranjit


MarKo9 commented 5 years ago

Nice manners toward people who test your library and bother to share the issues they find, even when they're wrong. BTW, I was concerned, and I did code my own solution, just not based on your library. A MICE-like, xgb-based custom imputation was considerably more accurate, at least on my dataset (to say nothing of the speed); you're welcome to benchmark against it before your next release.
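For readers curious what a MICE-style loop looks like, here is a minimal sketch of the general technique, not MarKo9's actual code: each column with missing values is repeatedly regressed on the others, refining an initial mean fill. Plain least-squares stands in for the gradient-boosted (xgb) regressor he describes, to keep the example dependency-free.

```python
import numpy as np

def mice_impute(X, n_iter=10):
    """MICE-style iterative imputation. An XGBoost regressor could
    replace the least-squares fit below for non-linear data."""
    X = np.asarray(X, dtype=float).copy()
    miss = np.isnan(X)
    # Start from column-mean imputation.
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rows = miss[:, j]
            if not rows.any():
                continue
            # Regress column j on all other columns (current fills included).
            others = np.delete(X, j, axis=1)
            A = np.column_stack([np.ones(len(X)), others])  # add intercept
            coef, *_ = np.linalg.lstsq(A[~rows], X[~rows, j], rcond=None)
            X[rows, j] = A[rows] @ coef
    return X
```

With a toy matrix where the second column is exactly twice the first, the missing entry converges to the regression prediction rather than staying at the column mean.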

cmlakhan commented 5 years ago

@MarKo9 Do you have code for this? I'd be interested in checking it out. Thanks for the reply, @ranjitlall.