Closed blueberry closed 4 years ago
Yes, I already mentioned this at the bottom of https://forum.byte-welt.net/t/why-jcuda-10-1-requires-usr-local-bfm-lib64-libstdc-so-6-version-cxxabi-1-3-8/21225 - and as the thread suggests: There have been some issues with the dependency versions.
As mentioned in the forum thread, there don't seem to be many updates or changes, so on the one hand, this has "low priority" (because there are no important new features), but also "high priority" because it can probably be done quickly - once I have done the update locally. Maybe I have some time for that in the next days, during the "holidays", but in any case, it's already on the radar.
Just a short note: I've started the update locally. There are some changes of which I'm not entirely sure whether and how they can be (sensibly) mapped to the Java world. Things like NvSciBuf/NvSciSync will likely be omitted. Other features like UVA are about to be added, but I have no idea whether they make sense for Java (or whether they are mainly intended for the "External Memory and Semaphore" operations, which aren't supported in Java anyhow). Also, CUSPARSE has a new library interface for Multi-GPU support and some deprecations - I still have to sort this out, but will continue with that during the weekend and beginning of next week.
With the "usual" delay, the update is done.
The "Release Candidate" that is supposed to be used for building the natives is tagged version-10.2.0-RC00, which currently matches the master branch.
@blueberry As mentioned in https://github.com/jcuda/jcudnn/issues/4#issuecomment-570054197 , we'll have to sort out this dependency issue for the release. I'll ping the guy who contributed the latest natives at https://forum.byte-welt.net/t/why-jcuda-10-1-requires-usr-local-bfm-lib64-libstdc-so-6-version-cxxabi-1-3-8/21225/20 , maybe he can create them for the new version as well.
(Edit: I wrote him a note at https://forum.byte-welt.net/t/why-jcuda-10-1-requires-usr-local-bfm-lib64-libstdc-so-6-version-cxxabi-1-3-8/21225/21 )
Some sort of "release notes" (or maybe a rant...?) for those who are interested:
There have been some hiccups related to CUSOLVER (not CUSPARSE as I said above). They introduced a new API, cusolverMg, for multi-GPU solvers. It required a (minor) update of the FindCUDA CMake script, and the bindings are basically there, but this is currently totally untested: They did add some snippets showing how to use the Multi-GPU API at https://docs.nvidia.com/cuda/cusolver/index.html#mgsyevd_examples , and I wasted some time trying to port this to Java, but it is ridiculously complicated, and so I deferred this (also because I cannot properly test it anyhow - I only have a single GPU right now).
It seems very unlikely that anybody will use cusolverMg anyhow, and even less from Java (which isn't very motivating, and may explain some of the delay of the update, admittedly...).
Additionally, I noticed that in the CUSOLVER documentation, at some points, they say: "Oh, by the way, this-and-that parameter may be NULL". They don't do this in the description of the parameters, but in the wall of text that describes what the function is supposed to do. I don't see a reasonable way to figure out which parameters may be NULL and which ones may not. I gave it a try by removing all NULL checks, in the hope that the library can figure it out and return a proper error code. But for some parameters, when they are NULL, CUSOLVER bails out with a segfault. So now the NULL checks are back in, which might cause a NullPointerException for some cases even when the parameter can be NULL. I'll have to wait for the bug reports to figure this out. (If someone wants to read the whole CUSOLVER doc, and send me a list of which parameters may or may not be NULL, I'd do an update...)
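To illustrate the trade-off described above, here is a minimal sketch in plain Java. It is not the actual JCuda code (where the checks sit in the native JNI layer). The class name, the method hypotheticalSolve, and the parameter names "a" and "work" are all made up for illustration. The idea is that a binding which checks everything rejects a null argument even if the library would accept it, while a per-parameter list of "documented as optional" names would let such arguments through.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class NullCheckSketch {

    // Hypothetical allowlist: parameter names that the documentation
    // says may be NULL. In reality this list would have to be compiled
    // by reading the library documentation function by function.
    private static final Set<String> NULLABLE =
        new HashSet<>(Arrays.asList("work"));

    // Throws a NullPointerException for a null argument, unless the
    // parameter name is on the allowlist.
    static void checkParameter(String name, Object value) {
        if (value == null && !NULLABLE.contains(name)) {
            throw new NullPointerException(
                "Parameter '" + name + "' is null");
        }
    }

    // Stand-in for a binding method (assumption: not a real JCuda or
    // CUSOLVER function). It only performs the checks and returns a
    // pretend "success" status of 0.
    static int hypotheticalSolve(double[] a, double[] work) {
        checkParameter("a", a);       // mandatory: null is rejected
        checkParameter("work", work); // optional: null is passed on
        return 0;
    }

    public static void main(String[] args) {
        // A null for the optional parameter is accepted
        System.out.println(hypotheticalSolve(new double[] { 1.0 }, null));

        // A null for the mandatory parameter is rejected up front,
        // instead of crashing inside the native library
        try {
            hypotheticalSolve(null, null);
        } catch (NullPointerException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Without such a list, the only safe default is to reject every null, which is what can lead to a NullPointerException for parameters that would in fact be allowed to be NULL.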
@jcuda Thank you so much Marco for working on this!
I am in the middle of moving to a different apartment, so I can't build this right away, but as soon as I can find a slot, I'll build the Linux + OSX binaries (one of the first days of February most likely).
Hi @Jcuda,
I get the following build error when I try to build JCuda 10.2 on Linux:
jcuda.build cmake-gui
➜ jcuda.build make
[ 3%] Building CXX object jcuda/JCudaDriverJNI/bin/bin/CMakeFiles/JCudaCommonJNI.dir/src/JNIUtils.cpp.o
/home/dragan/workspace/java/jcuda/jcuda-common/JCudaCommonJNI/src/JNIUtils.cpp: In function ‘bool initNative(JNIEnv*, jobjectArray, int**&, bool)’:
/home/dragan/workspace/java/jcuda/jcuda-common/JCudaCommonJNI/src/JNIUtils.cpp:538:23: error: ‘nullptr’ was not declared in this scope
if (javaObject == nullptr)
^
/home/dragan/workspace/java/jcuda/jcuda-common/JCudaCommonJNI/src/JNIUtils.cpp: In function ‘bool releaseNative(JNIEnv*, int**&, jobjectArray, bool)’:
/home/dragan/workspace/java/jcuda/jcuda-common/JCudaCommonJNI/src/JNIUtils.cpp:560:25: error: ‘nullptr’ was not declared in this scope
if (nativeObject == nullptr)
^
make[2]: *** [jcuda/JCudaDriverJNI/bin/bin/CMakeFiles/JCudaCommonJNI.dir/build.make:63: jcuda/JCudaDriverJNI/bin/bin/CMakeFiles/JCudaCommonJNI.dir/src/JNIUtils.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:351: jcuda/JCudaDriverJNI/bin/bin/CMakeFiles/JCudaCommonJNI.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
I have updated my system's CUDA and cuDNN to 10.2.89 and 7.6.5. (I have also manually built gcc 4.8.5 to support older RHEL, but this is not related to this issue because I get this error anyway). I have manually set the few missing references to the cuDNN library, include, etc. in cmake-gui.
Everything seems to work well until the make step.
@jcuda FYI changing "nullptr" to "NULL" fixes this build. Was nullptr some sort of typo, or was it intentional (and in that case, how should this be fixed)?
Linux binaries (with gcc 4.8.5):
I'll build them for macOS now and will upload them as soon as they are ready.
When will you have time to wrap it up into a release? I plan to release some dependent libraries when JCuda is ready.
macOS binaries:
IMPORTANT:
Thanks @blueberry !
The issue of nullptr vs. NULL: The nullptr keyword is a replacement for NULL from C++11 onwards (see https://en.cppreference.com/w/cpp/keyword/nullptr ), because NULL is only a macro for an implementation-defined null pointer constant (often a plain integer 0) and not a value of a proper pointer type. A compiler that does not run in C++11 mode does not know the keyword, hence the error. This is fixed in https://github.com/jcuda/jcuda-common/commit/c0576702d682d1004c231ba2e9376617cef97eba , and they should semantically be the same here.
The deprecation of MacOS support was already mentioned elsewhere, and we (or rather: The MacOS users) will just have to anticipate that. I'll try to include the MacOS binaries for the last time, despite the doubts of whether they'll actually work - again, there's not much else that we can do.
Regarding the Linux binaries: I haven't heard back from the contributor who provided the Linux binaries with the older dependencies. I pinged him again in https://forum.byte-welt.net/t/why-jcuda-10-1-requires-usr-local-bfm-lib64-libstdc-so-6-version-cxxabi-1-3-8/21225/22 - if there is no response, I'll try to schedule the update (maybe end of this week, but) not later than beginning of next week, and drop a note here when it's done.
@Jcuda regarding the other contributor: you might have missed that, but I already provided binaries that support legacy gcc 4.8.5 that he asked for.
Sorry, I noticed that you mentioned gcc 4.8.5, but wasn't aware that this implies that the right (lower) CXXABI-version will be used (I'm not so familiar with the Linux world, obviously...).
Then I'll try to do the release this week, but again, it should not be later than Monday/Tuesday next week (there's some task during the weekend that might block me for some time).
JCuda 10.2.0 is on its way into Maven Central, and should be available in a few minutes, under the usual coordinates:
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcuda</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcublas</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcufft</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcusparse</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcusolver</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcurand</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jnvgraph</artifactId>
    <version>10.2.0</version>
</dependency>
<dependency>
    <groupId>org.jcuda</groupId>
    <artifactId>jcudnn</artifactId>
    <version>10.2.0</version>
</dependency>
As usual, a huge thanks @blueberry for the support and for providing the Linux- and Mac binaries!
(I'm curious to see whether Mac people will stumble over the deprecation of cuDNN on Mac via JCudnn... in that case, however, we'd have to point to version 10.1.0 and to complaints@nvidia.com ;-))
If there are no problems reported with this release, I'll close this issue (and update the website and README) in a few days.
Thank you Marco!
Hello Marco and blueberry,
I am trying to build JCuda 10.2 but couldn't succeed.
Ubuntu 18.04.4 LTS CUDA 10.2 Java 11.0.6 gcc 7.4.0 cmake 3.10.2 cuDNN 7.6
I receive an error like the following:
emel@bsb-workstation:~/jcuda$ cmake ./jcuda-main
-- The C compiler identification is GNU 7.4.0
-- The CXX compiler identification is GNU 7.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda-10.2/bin/nvcc
CMake Error at /usr/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
Could NOT find JNI (missing: JAVA_AWT_INCLUDE_PATH)
Call Stack (most recent call first):
/usr/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.10/Modules/FindJNI.cmake:310 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
../jcuda-common/JCudaCommon_CMake.txt:33 (find_package)
CMakeLists.txt:7 (include)
-- Configuring incomplete, errors occurred!
See also "/home/emel/jcuda/CMakeFiles/CMakeOutput.log".
See also "/home/emel/jcuda/CMakeFiles/CMakeError.log".
I looked around and there were several suggestions for
Could NOT find JNI (missing: JAVA_AWT_INCLUDE_PATH)
and other errors; however, I couldn't resolve this problem.
Is there any way that I can solve this situation without reverting to older versions of any of the tools that I listed initially? Like Java 1.8 or gcc 4.8 etc.
Thanks in advance !
(Just curious: Is there any reason why you don't use the release via Maven? (I'll also upload a single ZIP with all JARs on the website, but this currently has low priority)).
The most likely reason for the error message is:
You might have installed a JRE (Java Runtime Environment) instead of a JDK (Java Development Kit). (I'm not so familiar with Linux/Ubuntu. If this is the case, I'd also have to do some websearches for possible solutions...)
Another possible reason is:
Something might have changed between Java 8 and Java 11 that affects whether CMake can find the required paths. That would be a nuisance.
A drive-by of technical details of CMake that I'm usually not concerned with, but that might be a step towards a solution for both cases: Do you have the directory that is mentioned at https://github.com/Kitware/CMake/blob/master/Modules/FindJNI.cmake#L207 ?
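As a quick way to check both cases without digging through CMake, here is a small sketch that can be run with the same Java installation that CMake is supposed to find. It assumes that the JNI headers are expected in an "include" directory below the Java installation, and that JAVA_AWT_INCLUDE_PATH refers to the directory containing jawt.h. The class and method names are made up for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.tools.ToolProvider;

final class JniHeaderCheck {

    // Returns whether <javaRoot>/include/<header> exists as a file
    static boolean hasHeader(Path javaRoot, String header) {
        return Files.isRegularFile(
            javaRoot.resolve("include").resolve(header));
    }

    // Returns the directory that is expected to contain "include".
    // In the Java 8 layout, java.home points to the "jre" subdirectory
    // of the JDK, so the parent is used. Otherwise, java.home itself.
    static Path jdkRoot(Path javaHome) {
        Path name = javaHome.getFileName();
        if (name != null && name.toString().equals("jre")
            && javaHome.getParent() != null) {
            return javaHome.getParent();
        }
        return javaHome;
    }

    public static void main(String[] args) {
        Path root = jdkRoot(Paths.get(System.getProperty("java.home")));
        System.out.println("Java root: " + root);

        // The system compiler is only available when running on a JDK
        boolean isJdk = ToolProvider.getSystemJavaCompiler() != null;
        System.out.println("Compiler available (JDK): " + isJdk);

        System.out.println("include/jni.h found:  "
            + hasHeader(root, "jni.h"));
        System.out.println("include/jawt.h found: "
            + hasHeader(root, "jawt.h"));
    }
}
```

If jawt.h is reported as missing although a compiler is available, then the installed JDK package probably does not ship that header, and CMake cannot find it either. If the files are there, the problem is more likely that CMake looks at a different Java installation than the one that was used to run this check.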
Thank you for the quick response !
Actually, when I first checked maven repository upon reading messages here, I got lost a little bit (not familiar with maven) and thought those folders were missing something compared to the zip files you share on the website. Now I realize that libraries are shared individually. I will use them asap!
About jre-jdk: I checked jdk through checking javac version. So, I think it is not the issue.
I should try modification on the FindJNI.cmake file, though.
Thanks a lot for the kind response.
A side note: The package on the website is still for CUDA 10.1 - the update to 10.2 happened only recently, and I haven't (yet) managed to upload the package to the website.
Since JCuda does not have any further dependencies, you could still use the JARs directly. But I'd strongly recommend using Maven: If you only want to use libraries, it's very simple, and you don't have to care about transitive dependencies.
(If you wanted to publish a library on your own in Maven Central, possibly even a library with JNI, I'd say "Welcome to the clusterfLIck of 'convention over configuration'". But if you only want to use Maven libraries, it makes life really simpler....)
I forgot to close this issue. Everything's been working smoothly with 10.2, and some bugs found in 10.1 (regardless of whether they were introduced by JCuda or present in Nvidia's driver itself) have been fixed in 10.2. Thank you a lot for these great libraries, Marco!
BTW, I've just opened a related thread for JOCL (https://github.com/gpu/JOCL/issues/27)
Thanks @blueberry - I appreciate your "heads ups" for new versions, and of course, your contributions. But I'll leave this one open just as a reminder that the READMEs and the website still have to be updated for 10.2.
(It's not much effort, but I've been facing some tight schedules in the past few weeks, and assume that most people will either use Maven+10.1, or figure out that 10.2 is available anyhow, so had to defer this a bit)
Along the same line: Thanks for your pointer to the JOCL issue. In fact, there has been a discussion about HIP support at https://github.com/jcuda/jcuda/issues/5 (and HIP and ROC seem to be the same thing, or at least related).
I'll have to re-read the other issue as a refresher - it's been ~4 years since then, and I'm sure a lot has happened in the meantime. Although the sticky note to consider creating "JHip" is still on my table, I'll have to carefully look at the current projects and their structure to see whether I can even consider carving out an appropriate chunk of my spare time for that - I already have the feeling of neglecting too many of my "spare time" projects...
Thanks @jcuda Just to be clear, MIOpen supports OpenCL. I hope it means that it can be supported with the existing JOCL infrastructure, without the need of JHip!
This one was still open because the README and website had not been updated with the new version number, but this has become obsolete with the update to CUDA 11.
Hi Marco!
You probably already know that Nvidia released a new version update; I'm just opening this issue as a reference point for (I hope) upcoming support for CUDA 10.2 in JCuda. Any plans for working on this? As usual, I'll build (and test via Neanderthal & ClojureCUDA on Linux) the Linux and MacOS binaries.