High-impact application of HIP

jyegerlehner commented 8 years ago

Google has contributed to a beta version of Eigen (3.3) that uses CUDA to implement their tensor operations in Tensorflow. If I understand the HIP marketing literature, it allows one to port a CUDA codebase to run on AMD GPUs whilst keeping a mostly CUDA-type codebase.

There is an effort underway to port tensorflow to an opencl back-end, but it appears to have mostly stalled out. At least from what I can tell.

Some of us tensorflowers have found that the fp16 support in Maxwell and Pascal (GP102) GPUs (maybe not GP100, but who has $120K lying around?) is crippled to some extent; runs slower than fp32. fp16, even if it's not faster, allows larger models to fit in memory. I think Vega and Polaris support native fp16. It might be a big win for those of us who would want to run fp16 to have a hippified Eigen backend in Tensorflow.

Sorry, I'm not offering to help. But in case this is news to anyone here, I thought I would pass along the thought.

aditya4d commented 8 years ago

Hi, @bensander can answer it correctly.

gstoner commented 8 years ago

Your wish is being granted, Eigen is being ported over AMD GPU via HIP. The second part of your request is can we bring standardized tool supporting FLOAT16 that ships with all our GFX8 GPU's, wish granted.

Our development branch of AMDGPU compiler now support's both Float16 and Int16 native instruction, instead of emulating FP16/Int16 with up convert & down convert instructions to convert from FP16/Int16 to Float and back.

This is f16 tests on Fiji hardware successfully executing a matrix multiplication with half types with conversion and with Native instructions.

Orig Conversion based: flat_load_ushort v8, v[6:7] flat_load_ushort v9, v[4:5] v_cvt_f32_f16_e32 v8, v8 v_cvt_f32_f16_e32 v9, v9 v_mac_f32_e32 v3, v9, v8

new Native Float : flat_load_ushort v8, v[6:7] flat_load_ushort v9, v[4:5] v_mac_f16_e32 v3, v9, v8

xyang2013 commented 8 years ago

Could you provide more information on when the Eigen port will be available? Also could AMD be more involved in getting Tensorflow to work on their graphic cards? I'm deciding on whether to purchase VEGA or Pascal based cards. Thank you.

bhack commented 8 years ago

Upstream Eigen Tensor now uses syCL. I think that Codeplay computecpp can already target AMD gpu. AMD had in its hand triSYCL (now at xilinx https://github.com/Xilinx/triSYCL). I don't know if there are still syCL plans in AMD.

jyegerlehner commented 8 years ago

Thanks that sounds promising.

Is there any intention to get it integrated such that there will be a tensorflow branch that exploits your hippified Eigen? Or is someone else going to need to do that?

jyegerlehner commented 8 years ago

@gstoner

Our development branch of AMDGPU compiler now support's both Float16 and Int16 native instruction

I haven't kept up with how the pieces all fit together since the switch to this new AMDGPU thing. So we will need to wait until that code makes it into a AMDGPU-Pro version more recent than 16.40 (the version for Ubuntu currently for download on AMD site)? If so, any idea when that might be? And also, please confirm if my understanding is correct that what you're talking about is the compiler that opencl stack uses to compile opencl kernels. Or if that's not right, can you spell it out for us, please?

@bhack Are you saying the opencl support that got merged into Tensorflow has a dependency on something other than a plain opencl stack? I don't know what syCL is.

bhack commented 8 years ago

@jyegerlehner Yes upstream Tensor in Eigen and what landed in TF master some days ago require a sycl implementation: https://www.khronos.org/sycl

LifeIsStrange commented 8 years ago

@xyang2013 AMD put a lot of efforts on opencl, they work on blender, libreoffice, caffecl, etc. If you need opencl/cuda and has a long term vision, choose AMD. Because opencl/syCL will kill cuda because of theses things: Android support. When Lightview AIs will be used on Android opencl will be far more interesting. MacOS support : all modern macs only have AMD or Intel gpus. Performance/price AMD has a better compute architecture. FPGA support. Intel gpus support. And finally : webCL. It is not yet supported by browsers but it will revolutionize websites with machine learning among others things. And openCL has the spir-V avantages (you can use it easily with opengl, Vulkan, and OpenVX and others AIs oriented apis from khronos.) what cuda can't. GPU marketshare of AMD is growing a lot since polaris. And finally in 2017 everybody should buy AMD zen apus, Where their igpu will have the power of a gtx 960, enough for opencl being used on every computers. And nvidia gpu marketshare will die, because the AMD apus will eat all the low end and mid end dgpu marketshare. And they will eat the high end because of crossfire support (multicpu) so your amd dgpu will be able to use the power of your amd igpu similtaneously, what nvidia can't do.

gstoner commented 8 years ago

Float16 and Int16 for for Native GFX8.x based GPUs is in LLVM 4.0 source tree, llvm-mirror/llvm@9027123

LifeIsStrange commented 7 years ago

Hi AMD, as a shareholder i've recently seen your announcement about begining to invest in autonomous cars: Here a thing that could save you a lot of money : Don't reinvent the wheel, improve it ! https://github.com/commaai/openpilot This software pretend to be on par with Tesla ! Reuse it and port it to opencl/syCL/HSA. A lot of cars should soon use this software (once stabilised), if you optimise it for your hardware, they should rationnaly use your hardware.

espadrine commented 7 years ago

@LifeIsStrange AFAIK, commaai is not fully open-source; the critical piece that performs video analysis and path guessing is a binary blob: https://github.com/commaai/openpilot/tree/master/selfdrive/visiond.

But this is off-topic.

rodburns commented 7 years ago

Luke has posted an update on the Codeplay efforts to bringing SYCL to TensorFlow at the original thread. If you have any questions and want to talk to me directly about the work you can reach me on rod at codeplay dot com.

gstoner commented 7 years ago

If you guys saw the Radeon Instinct launch you will find we finaly anounced our big push into Deep Learning. Here is good article http://www.anandtech.com/show/10905/amd-announces-radeon-instinct-deep-learning-2017

We will be delivery HIP version of Caffe, Tensorflow, Torch7, MxNet, Theano, CNTK, Chainer, all supporting our new new MIpen our new Deep Learning solver.

Since the everyone is interested in Tensorflow

HipEigen 35/43 GPU tests pass today.
Stream Executor integrated with hipFFT, hipRNG, hipDNN ( MIOpen ) + other interefaces https://github.com/tensorflow/tensorflow/tree/master/tensorflow/stream_executor/cuda
All Kernels in tensorflow/tensorflow/core/kernels/ have been Ported

Note this will run on AMD and NVIDIA hardware

bhack commented 7 years ago

@gstoner How you will be approaching to XLA?

gstoner commented 7 years ago

XLA compiler fronted to our llvm compiler on our port of streamexexcutor to ROCm

Get Outlook for iOShttps://aka.ms/o0ukef

On Fri, Dec 30, 2016 at 12:31 PM -0600, "bhack" notifications@github.com<mailto:notifications@github.com> wrote:

@gstonerhttps://github.com/gstoner How you will be approaching to XLAhttps://autodiff-workshop.github.io/slides/JeffDean.pdf?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/45#issuecomment-269805725, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AD8DuR1dXdk2eoL3IjUYJXFRte4xcSz3ks5rNU4NgaJpZM4KXhVY.

bhack commented 7 years ago

Interesting

bhack commented 7 years ago

@gstoner https://www.tensorflow.org/versions/master/experimental/xla/developing_new_backend

aditya4d commented 7 years ago

Closing this issue, can continue the discussion

oscarbg commented 7 years ago

@gstoner Any update on hipDNN or MIOpen? wasn't a Q1 2017 thing? in that case coming in a few days or delayed? thanks..

gstoner commented 7 years ago

We are working hard on it, it has some nice surprises.

greg On Mar 21, 2017, at 6:24 PM, Oscar Barenys notifications@github.com<mailto:notifications@github.com> wrote:

@gstonerhttps://github.com/gstoner Any update on hipDNN or MIOpen? wasn't a Q1 2017 thing? in that case coming in a few days or delayed? thanks..

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/45#issuecomment-288173605, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AD8DuQs1zLw-ckRPsI0jF0e4mhcMtgrfks5roBXDgaJpZM4KXhVY.

ROCm / HIP

High-impact application of HIP #45