Open · httpdigest opened this issue 8 years ago
This is in my plans, scheduled for a post-3.0 release.
> the person developing LWJGL happens to own an AMD GPU ;)
I've actually upgraded to a GTX a couple of months ago.
> I've actually upgraded to a GTX a couple of months ago.
Nice. :) But you do have your old AMD ready in case one needs a standard-conforming card/driver, right? ;)
Yes, I will upgrade the rest of the machine soon and the old parts will be used for Windows and Linux testing. I've been using a Linux VM so far, which is a pain; I can't test OpenGL and OpenCL properly.
I just went looking for an issue for this; I've started working on some stuff that uses CUDA for a game. I have to do that in C at the moment.
I just wanted to give this issue a +1. I'm currently writing a deep learning library in Java, as I think the JVM is a great platform for deep learning. I have already written my own bindings for OpenAI's Triton, but for CUDA, LWJGL-style bindings would be very nice for consistency. cuBLAS and cuRAND would also be very handy. Mixing JCuda with LWJGL is not a very pleasant experience, to say the least.
I got notified about this for some reason. So I'll throw in my two cents here:
The CUDA runtime library would only be a tiny, tiny, tiny fraction of "CUDA as a whole". And bindings for the CUDA runtime library alone would be pretty useless: You cannot really do anything useful with that, except for shuffling memory around.
Specifically: you cannot do GPU-based computations with the CUDA runtime library - at least, not from Java. When using the CUDA Runtime API in C, you define your kernels in `.cu` files and call them with the special `<<<...>>>` launch syntax. All of this is translated into an executable by `nvcc` (the NVIDIA CUDA compiler). But this doesn't work for Java.
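To make that concrete, here is a minimal `.cu` sketch (the `scale` kernel, the sizes, and the omission of error checking are mine, for illustration only). The runtime calls at the top are the "shuffling memory around" part that runtime-library bindings alone could expose to Java; the launch line is the part that only works because `nvcc` rewrites it:

```
#include <cuda_runtime.h>
#include <stdio.h>

// Device kernel, defined in the same .cu file as the host code.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main(void) {
    const int n = 1024;
    float host[1024];
    for (int i = 0; i < n; i++) host[i] = (float)i;

    // Runtime API memory management: allocating and copying is roughly
    // all that runtime-library bindings by themselves would offer.
    float *dev = NULL;
    cudaMalloc((void **)&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // The special launch syntax: nvcc translates this line, which is why
    // there is no direct equivalent for it in plain Java.
    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);

    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    printf("host[1] = %f\n", host[1]); // expect 2.0
    return 0;
}
```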
In order to use your own kernels in Java, you need the CUDA Driver API. There, you can load pre-compiled kernels (as PTX or CUBIN) and execute them. That's still inconvenient - so you'd probably also want support for NVRTC, so that kernels given as Java strings can be compiled at runtime.
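For comparison, this is the Driver API + NVRTC flow that such bindings have to mirror (again a minimal C sketch with error checking omitted; the kernel source and the `scale` name are illustrative, and on Linux you would link with `-lcuda -lnvrtc`):

```
#include <cuda.h>
#include <nvrtc.h>
#include <stdio.h>
#include <stdlib.h>

// Kernel source given as a plain string; extern "C" keeps the name unmangled
// so it can be looked up after compilation.
static const char *SRC =
    "extern \"C\" __global__ void scale(float *data, float factor, int n) {\n"
    "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "    if (i < n) data[i] *= factor;\n"
    "}\n";

int main(void) {
    // 1. Compile the source to PTX at runtime with NVRTC.
    nvrtcProgram prog;
    nvrtcCreateProgram(&prog, SRC, "scale.cu", 0, NULL, NULL);
    nvrtcCompileProgram(prog, 0, NULL);
    size_t ptxSize;
    nvrtcGetPTXSize(prog, &ptxSize);
    char *ptx = malloc(ptxSize);
    nvrtcGetPTX(prog, ptx);
    nvrtcDestroyProgram(&prog);

    // 2. Load the PTX and launch the kernel through the Driver API.
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    CUmodule module;
    cuModuleLoadData(&module, ptx);
    CUfunction fn;
    cuModuleGetFunction(&fn, module, "scale");

    int n = 1024;
    float factor = 2.0f;
    CUdeviceptr dptr;
    cuMemAlloc(&dptr, n * sizeof(float)); // left uninitialized to keep this short

    void *args[] = { &dptr, &factor, &n };
    cuLaunchKernel(fn, (n + 255) / 256, 1, 1, // grid
                   256, 1, 1,                 // block
                   0, NULL, args, NULL);      // shared mem, stream, params, extra
    cuCtxSynchronize();

    cuMemFree(dptr);
    cuModuleUnload(module);
    cuCtxDestroy(ctx);
    free(ptx);
    printf("compiled with NVRTC, launched via the Driver API\n");
    return 0;
}
```

These are the same entry points that the Driver API and NVRTC bindings mentioned below already expose on the Java side.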
> cuBLAS and cuRAND would also be very handy.
And cuSPARSE. And cuSOLVER. And cuFFT. And NPP. And NCCL. And of course cuDNN, for deep learning.
```
cudnnStatus_t cudnnRNNBackwardData_v8(
    cudnnHandle_t handle,
    cudnnRNNDescriptor_t rnnDesc,
    const int32_t devSeqLengths[],
    cudnnRNNDataDescriptor_t yDesc,
    const void *y,
    const void *dy,
    cudnnRNNDataDescriptor_t xDesc,
    void *dx,
    cudnnTensorDescriptor_t hDesc,
    const void *hx,
    const void *dhy,
    void *dhx,
    cudnnTensorDescriptor_t cDesc,
    const void *cx,
    const void *dcy,
    void *dcx,
    size_t weightSpaceSize,
    const void *weightSpace,
    size_t workSpaceSize,
    void *workSpace,
    size_t reserveSpaceSize,
    void *reserveSpace);
```
🤡
Eventually, you'll have many libraries, with ridiculously complex APIs, huge dependencies, and fast-paced changes and release cycles. Keeping these libraries and the surrounding deployment infrastructure up to date would require a team of people working full-time on that. So I fully understand the maintainer(s?) of LWJGL here...
However, the people at https://github.com/bytedeco/javacpp-presets/tree/master/cuda apparently still manage to keep that stuff up to date. I tried out their CUDA bindings once, and everything worked smoothly, so that may be worth a look - and maybe it's easier to get this interoperable with LWJGL than JCuda.
Other than the Driver API and NVRTC bindings, which we already have, there are no plans to support more CUDA APIs in LWJGL 3.
However, assuming CUDA remains as popular as it is today, LWJGL 4 will eventually add everything useful. Hopefully with Project Valhalla, so that the various CUDA types can be mapped sanely.
It'd be nice to have bindings for the CUDA C Runtime Library ("cudart").
I know this is not a platform-independent solution and the person developing LWJGL happens to own an AMD GPU ;), but the vast amount of effort and energy Nvidia is pumping into CUDA compared to everything else really shows when you look at the tool landscape of compilers, debuggers, profilers and analysis tools, and also the supporting libraries (CUDPP, cuBLAS, cuFFT, cuSPARSE, Thrust, ...), which are essential for developing anything beyond trivial GPU example programs.
Currently, profiling and especially debugging OpenCL programs and OpenGL compute shaders is a real nightmare. :)
I also know there is JCuda, but an LWJGL solution would be nice, too.