Open philtomson opened 9 years ago
Hi,
We are considering this as well! The thing is that companies like AMD are actually developing environments that will be compatible with CUDA in the future: http://www.anandtech.com/show/9792/amd-sc15-boltzmann-initiative-announced-c-and-cuda-compilers-for-amd-gpus. So we are still deciding whether to put our limited human resources into supporting OpenCL. If you are interested, you are more than welcome to help us enhance the project in this direction.
Thank you, Minjie
Hi Minjie,
Maybe we can collaborate on extending MXNet with OpenCL support. We have open-sourced OpenCL Caffe; I guess we can reuse the core kernels?
Thanks! Junli
Ha, it's great to hear a voice from AMD people here :p. I heard that the main problem with integrating OpenCL is its lack of support for templates, which are widely used in mshadow. @tqchen may know more details about this.
Minjie
@gujunli it's very nice to see you here (we met at ICML Beijing last year). We are definitely interested in OpenCL and hope to support it ASAP. But the current issue is that we use C++ templates, which OpenCL doesn't support; see https://github.com/dmlc/mshadow/issues/71
Templates are an issue. On AMD devices we have a special keyword to support them, no problem. The problem is that the same keyword does not work on NV GPUs. We are also figuring out a general solution. I would like to hear your thoughts on this. @limu @minjie
junli
Junli Gu (谷俊丽), Coordinated Science Lab, University of Illinois at Urbana-Champaign
NVIDIA GPUs should be fine with CUDA. Our main motivation to support OpenCL is for AMD GPUs and other devices, such as FPGAs. For example, Altera also contacted us about making MXNet run on their devices.
This won't pose a problem as long as AMD's version of the compiler supports something similar to what nvcc does, i.e. template programming and integration of host and device code.
What can be done is to have something like tensor_gpu-inl.amd.cc that specializes for AMD's version of the keyword. As long as the extra keywords are minimal and the compiler can be detected by a macro, it should be fine.
It would be nice to be able to target FPGAs and OpenCL would allow that to be done much more easily through the Altera tool chain.
Also: I'm not sure I understand the templates issue. Isn't there a C API that could be used to get around that?
I'll also add that OpenCL would allow targeting Intel integrated graphics, which is pretty common on a lot of laptops as well as desktops these days.
@philtomson The problem is more for the kernel code. OpenCL uses C as a language for its kernels and MXNet uses C++ for CUDA and CPU kernels and is able to generate both from the same template, which is nice because you don't need to support 2 or 3 different versions of things.
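To make the template point concrete: because OpenCL C kernels cannot be templated on the element type, one common workaround is to generate the kernel source once per type by text substitution. This is an assumption for illustration, not something MXNet or mshadow is known to do. The sketch below only builds source strings; the names `KERNEL_TEMPLATE`, `CTYPES` and `render_add_kernel` are hypothetical, and nothing is compiled or run on a device.

```python
from string import Template

# OpenCL C kernel text with placeholders for the element type.
# OpenCL C 1.x has no templates, so the type is substituted as plain text.
KERNEL_TEMPLATE = Template("""\
__kernel void add_${suffix}(__global const ${ctype} *a,
                            __global const ${ctype} *b,
                            __global ${ctype} *out)
{
    size_t i = get_global_id(0);
    out[i] = a[i] + b[i];
}
""")

# Hypothetical mapping from a framework dtype name to an OpenCL C type.
CTYPES = {"float32": "float", "int32": "int", "uint8": "uchar"}


def render_add_kernel(dtype):
    """Return OpenCL C source for an element-wise add over `dtype`."""
    ctype = CTYPES[dtype]  # raises KeyError for unsupported dtypes
    return KERNEL_TEMPLATE.substitute(suffix=dtype, ctype=ctype)


# One source string per supported type: roughly what a C++ template
# would otherwise instantiate automatically from a single definition.
sources = {dtype: render_add_kernel(dtype) for dtype in CTYPES}
```

The cost of this approach is the one described above: the generated OpenCL text is a second code path that has to be kept in sync with the C++ templates used for the CUDA and CPU kernels.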
+1 I want to experiment on my laptop which does not have cuda support!
@vchuravy Speaking of portability, maybe the problem is the use of templates in mxnet itself, because a neural network implementation doesn't really need templates for different data types. For a typical neural network implementation, single-precision floating point is most commonly used, because double precision is unnecessary and leads to much higher computational cost, and half-precision computation is not natively supported on many devices. Using fixed-point data types is a completely different case of performance optimization. What people really want is a single efficient, flexible, minimal and yet portable neural network implementation that can be ported to multiple CPUs, GPUs and FPGAs. The design principles of mxnet meet almost all of these requirements except the last one.
Has anyone tried the AMD HIP tools on MXNet?
+1
+1
Really want to see this happen someday for a major Python framework besides Tensorflow (and without using a limited, experimental, proprietary compiler framework). Competition!
https://www.khronos.org/registry/OpenCL/ The OpenCL 2.2 C++ language, including template support, is now in provisional status. Of course, so far no manufacturer has released 2.2 drivers.
This could help convert the CUDA kernels to OpenCL: https://github.com/hughperkins/cuda-on-cl
Hi all,
I've been trying to tackle this problem for some time. From my investigation, cocl does not work very well because mshadow is built on Thrust, which uses a lot of CUDA host-side APIs that are not supported by cocl. @delijati Therefore, what we found promising is to use VexCL as the vector expression library (instead of mshadow) for GPU devices. Currently I have most arithmetic operators on NDArray working, but I still need to fill in a whole lot of symbolic operators for the whole framework to work. Proof-of-concept code is here: https://github.com/windywinter/mxnet
Hi all,
I'm looking at PyOpenCL, and it could be a solution for MXNet. The challenge I've observed so far is that PyOpenCL requires the Intel OpenCL SDK to be installed on the user's machine (if they are running an Intel graphics card).
In the example shared in "Easy OpenCL with Python", Gaston Hillar demonstrates building and deploying a kernel with PyOpenCL in only 12 steps. I've tested his code and it works for me.
I wonder if MXNet would consider supporting PyOpenCL?
Update: I've tested DeepCL by Hugh Perkins on an Intel graphics card, running Q-Learning, and it runs perfectly in Python 2.7: https://github.com/viper7882/DeepCL.
Hugh Perkins has created EasyCL to access OpenCL-based GPUs: https://github.com/hughperkins/EasyCL. I'm evaluating whether it is possible to merge DeepCL with MXNet. Merging the two looks challenging to me because of the differences in their underlying structure. Any help is appreciated.
Hi @jermainewang ,
Hugh Perkins has provided an NVIDIA® CUDA™ cuDNN API for Coriander (OpenCL 1.2), which ideally should be able to interface with the existing MXNet NVIDIA® CUDA™ cuDNN API.
Could you take a look at whether it makes sense to connect MXNet with OpenCL through this interface?
Also: ROCm/HIP support for mxnet is a thing; it might be worth moving wholesale in that direction to cover CUDA/HIP ootb, and supporting OpenCL via Coriander. Not sure whether Coriander works on HIP code, but if the HIP is compiled via the CUDA path I don't see why not. It might even reduce the API surface for Coriander to cover?
https://github.com/ROCmSoftwarePlatform/mxnet
I would like to post an update on this: this can now be done via https://github.com/dmlc/tvm
@tqchen do you mean that TVM supports opencl? I would like to use mxnet with opencl to use ARM GPU (Mali).
Yes, TVM supports OpenCL, Metal, CUDA, ARM, x86, and JavaScript.
Hi all,
Guys, can anyone explain why mxnet still doesn't support OpenCL out of the box, even though it is based on nnvm now and through it on tvm, and should be able to perform all necessary computations using OpenCL devices? I checked on nnvm recently and it looks fully up to the task.
But even in the upcoming mxnet 0.12, the context mxnet.gpu() still means only "CUDA" and has no association with tvm.gpu() or tvm.cl(). Why?
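For what it's worth, the behaviour being asked for could look roughly like the sketch below: a generic "gpu" device type that resolves to whichever backend is actually present, rather than being hard-wired to CUDA. This is purely illustrative. `BACKEND_PREFERENCE` and `resolve_backend` are hypothetical names that exist in neither MXNet nor TVM, and the backend strings merely mirror TVM-style target names.

```python
# Hypothetical illustration only: neither MXNet nor TVM exposes this function.
# Each generic device type lists backends in order of preference.
BACKEND_PREFERENCE = {
    "cpu": ["llvm"],
    "gpu": ["cuda", "opencl", "metal"],
}


def resolve_backend(device_type, available):
    """Pick the first preferred backend for `device_type` that is available.

    `available` is the set of backends the current build/machine supports;
    how that set is discovered is outside the scope of this sketch.
    """
    for backend in BACKEND_PREFERENCE[device_type]:
        if backend in available:
            return backend
    raise RuntimeError(f"no backend available for device type {device_type!r}")


# On a machine with an AMD or Intel GPU, "gpu" would fall through to OpenCL.
chosen = resolve_backend("gpu", {"llvm", "opencl"})
```

With this kind of indirection, user code asking for a GPU context would not need to change between CUDA and OpenCL machines; only the set of available backends would differ.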
Perhaps more than 30% of consumer GPUs around are AMD/Intel-made devices supporting OpenCL >= 1.2. Very often this is great, inexpensive and less restricted hardware, and making it available for training would greatly benefit the ML community.
Any updates on this? @kpot makes a good point above, tvm (and nnvm due to being built off of it) supports opencl, so to me it seems like it shouldn't be too hard to implement opencl as an option. It would be nice to have a timeline for when this can be implemented and if not, what things are blocking it?
Hi All, are there any updates regarding this topic because I would like to see OpenCL be the default for MXNet as well as other ML libraries and frameworks instead of restricting GPU compute only to Nvidia hardware and CUDA?
Do you already provide an installation kit for AMD GPU RX550? Does it work with Windows 10? Does it work with Jupyter, Anaconda and Keras on top of Tensorflow?
+1 waiting for it
Also hoping to have it!
waiting for this...
At the moment there are cloud providers like gpueater pushing the AMD option, which naturally leads towards Keras+PlaidML, not MXNet. My ideal would be to be able to take one of the (almost universally AMD-based) cryptocurrency rigs you can pick up for a reasonable price and see what deep learning you can do with it.
Can anybody update us about this?
TensorFlow, PyTorch, MXNet... none of them listen to the users on this need. I've got an Intel card in 3 laptops; using the NEO OpenCL driver with LuxRender, for example, it computes 7x to 20x faster. But for ML, I can't.
OpenCL is not restrictive, it is open, and it works on a large variety of cards; even a Raspberry Pi can use OpenCL, cf. Pi OpenCL.
Please consider SYCL, for example. Not all of us can afford Thunderbolt hardware...
@metal3d contributions welcome. Also see TVM
@leezu excuse me, but your remark does not seem serious. "Contributions welcome" is like saying "do it if you're so strong".
The core of kernel compilation for machine learning in this kind of framework is "central"; it is something chosen at the beginning and carried along the development process.
A contribution by "one guy external to the project" is not possible for that. If I want to work on that:
The problem is that we have been asking for OpenCL in a lot of frameworks for months, or years, and there are rarely any answers about:
We don't force authors to use OpenCL, we only wonder why nothing has been done in that direction. Look at this issue: it has been open for 4 years.
4 years!
Look at the question on TF: https://github.com/tensorflow/tensorflow/issues/22
4 years too.
As for MXNet, we never had a "clear" answer. No trace of anything that explains why and/or how to address that need, or whether someone is working on it...
Worse: TF has only one active project to compile it with SYCL. You need to register as a user, and attempt a long compilation that fails 90% of the time.
So, sorry if my comments, questions, and answers seem "aggressive", but 4 years is a bit long without any clear answer like "we won't do that", or "we cannot", or "we will try", and/or why it is not under way.
So, "contributions welcome"... please... it's as if you had been asking someone in the street "sorry, what time is it?" for 5 hours... and after that the man answers "go buy a watch".
I don't see any blocker to adding the feature you're requesting; there's just no one willing to work on it. You pointed out the constraints correctly: a lot of resources are required. Thus my comment is serious. TVM will solve the problem in the not-too-far future, so there is no strong incentive to invest resources now into manually writing code targeting OpenCL. Did you take a look at https://docs.tvm.ai/tutorials/get_started.html#generate-opencl-code ?
First of all, thanks for your answer.
"I don't see any blocker to add the feature you're requesting, just there's no one willing to work on it."
That's the problem we are pointing out. The problem that I see (and that others can see too) is that the major frameworks seem to be trying to make things "faster and easier" before making the framework more widely usable. That's all we are saying... It's cool that CUDA is supported and that AWS or Google offer GPUs on demand. But in reality, OpenCL can help make ML more accessible for owners of modest hardware.
And the problem has now persisted for 4 or 5 years. I hope you understand the frustration.
More than that, this gives a large monopoly to NVIDIA that no one seems to want to stop...
As explained in https://towardsdatascience.com/on-the-state-of-deep-learning-outside-of-cudas-walled-garden-d88c8bbb4342 article:
"Open source code that targets only a proprietary target is not exactly open open source. We can do better!"
And I agree with that.
You said:
"Thus my comment is serious."
Excuse me, it could be a translation problem (I'm not English, excuse my bad English BTW), but in French it sounds like "do it yourself". That's probably why I answered a bit aggressively.
As for TVM: no, sorry, I didn't know that project and I will take a look. I'm not sure it will resolve the issue, but the page you pointed to looks interesting. Thanks for that.
I hope that you don't take my comment too severely.
@metal3d It's the tradition of where the coders are. Some projects opt to cut resources down to the minimum working objective, meaning that integration with a wide variety of choices is left behind, while spending money on other things (like NVIDIA hardware) doesn't seem to be a problem. Why do you think people still code these things on Windows although it's a terrible platform for it? Tradition. It's the sad reality of resource limitations and, mostly, tradition of training. ROCm now works with MXNet apparently... by using nvcc code, lol. I also think OpenCL should be the way to go, as Intel, AMD, NVIDIA, etc. are all supported. And I guess for work, I'll be forced to buy an NVIDIA card at 3x the price (instead of 3 GPUs of the same performance) to run my software, because most toolkits I use are for CUDA. AMD is a rich company and they lagged behind, and now they are forced to adapt ROCm to CUDA... instead of having something more general.
+1 waiting for it
It would be nice to eventually have OpenCL support for those of us with GPUs that don't do CUDA.