bytedeco / javacv

Java interface to OpenCV, FFmpeg, and more

A fatal error has been detected by the Java Runtime Environment - new pc #2236

Closed gareth-edwards closed 1 month ago

gareth-edwards commented 1 month ago

I've set up a new pc with a project that was working on a different machine. I am getting "A fatal error has been detected by the Java Runtime Environment". I guess I need to add something, but am stuck as to what.

The fatal error seems to occur quite easily. Two calls that produce it are:

opencv_core.useOpenCL(), and opencv_imgproc.getStructuringElement(opencv_imgproc.MORPH_RECT, new Size(2, 2)).getUMat(ACCESS_READ);

I'm using Eclipse and OpenJDK 8.

My POM file has:

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>javacv-platform</artifactId>
        <version>1.5.10</version>
    </dependency>

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>opencv-platform</artifactId>
        <version>4.9.0-1.5.10</version>
    </dependency>

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>opencv-platform-gpu</artifactId>
        <version>4.9.0-1.5.10</version>
    </dependency>

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>ffmpeg-platform</artifactId>
        <version>6.1.1-1.5.10</version>
    </dependency>

I've tried to get debug logs with:

        System.setProperty("org.bytedeco.javacpp.logger.debug", "true");
        //System.setProperty("org.bytedeco.javacpp.logger", "slf4j");  // stops seeing debug logs?
        Logger logger = LoggerFactory.getLogger(Main.class);

These are the logs:

    Starting...
    Calling opencv_core.haveOpenCL()...

    Debug: Loading class org.bytedeco.javacpp.presets.javacpp
    Debug: Loading class org.bytedeco.openblas.global.openblas_nolapack
    Debug: Loading class org.bytedeco.javacpp.presets.javacpp
    Debug: Loading class org.bytedeco.openblas.global.openblas_nolapack
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libgcc_s.so.1
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libquadmath.so.0
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libgfortran.so.5
    Debug: Loading library gfortran
    Debug: Failed to load for gfortran@.4: java.lang.UnsatisfiedLinkError: no gfortran in java.library.path
    Debug: Loading library gfortran
    Debug: Failed to load for gfortran@.3: java.lang.UnsatisfiedLinkError: no gfortran in java.library.path
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libopenblas_nolapack.so.0
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libjniopenblas_nolapack.so
    Debug: Loading class org.bytedeco.javacpp.presets.javacpp
    Debug: Loading class org.bytedeco.javacpp.Pointer
    Debug: Loading /home/gareth/.javacpp/cache/javacpp-1.5.10-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64/libjnijavacpp.so
    Debug: Loading class org.bytedeco.openblas.global.openblas
    Debug: Loading class org.bytedeco.javacpp.presets.javacpp
    Debug: Loading class org.bytedeco.openblas.global.openblas_nolapack
    Debug: Loading class org.bytedeco.openblas.global.openblas
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libopenblas.so.0
    Debug: Loading /home/gareth/.javacpp/cache/openblas-0.3.26-1.5.10-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64/libjniopenblas.so
    Debug: Loading class org.bytedeco.opencv.global.opencv_core
    Debug: Loading library cudart
    Debug: Failed to load for cudart@.12: java.lang.UnsatisfiedLinkError: no cudart in java.library.path
    Debug: Loading library cublasLt
    Debug: Failed to load for cublasLt@.12: java.lang.UnsatisfiedLinkError: no cublasLt in java.library.path
    Debug: Loading library cublas
    Debug: Failed to load for cublas@.12: java.lang.UnsatisfiedLinkError: no cublas in java.library.path
    Debug: Loading library cufft
    Debug: Failed to load for cufft@.11: java.lang.UnsatisfiedLinkError: no cufft in java.library.path
    Debug: Loading library cudnn
    Debug: Failed to load for cudnn@.8: java.lang.UnsatisfiedLinkError: no cudnn in java.library.path
    Debug: Loading library nppc
    Debug: Failed to load for nppc@.12: java.lang.UnsatisfiedLinkError: no nppc in java.library.path
    Debug: Loading library nppial
    Debug: Failed to load for nppial@.12: java.lang.UnsatisfiedLinkError: no nppial in java.library.path
    Debug: Loading library nppicc
    Debug: Failed to load for nppicc@.12: java.lang.UnsatisfiedLinkError: no nppicc in java.library.path
    Debug: Loading library nppicom
    Debug: Failed to load for nppicom@.12: java.lang.UnsatisfiedLinkError: no nppicom in java.library.path
    Debug: Loading library nppidei
    Debug: Failed to load for nppidei@.12: java.lang.UnsatisfiedLinkError: no nppidei in java.library.path
    Debug: Loading library nppif
    Debug: Failed to load for nppif@.12: java.lang.UnsatisfiedLinkError: no nppif in java.library.path
    Debug: Loading library nppig
    Debug: Failed to load for nppig@.12: java.lang.UnsatisfiedLinkError: no nppig in java.library.path
    Debug: Loading library nppim
    Debug: Failed to load for nppim@.12: java.lang.UnsatisfiedLinkError: no nppim in java.library.path
    Debug: Loading library nppist
    Debug: Failed to load for nppist@.12: java.lang.UnsatisfiedLinkError: no nppist in java.library.path
    Debug: Loading library nppisu
    Debug: Failed to load for nppisu@.12: java.lang.UnsatisfiedLinkError: no nppisu in java.library.path
    Debug: Loading library nppitc
    Debug: Failed to load for nppitc@.12: java.lang.UnsatisfiedLinkError: no nppitc in java.library.path
    Debug: Loading library npps
    Debug: Failed to load for npps@.12: java.lang.UnsatisfiedLinkError: no npps in java.library.path
    Debug: Loading library cudnn_ops_infer
    Debug: Failed to load for cudnn_ops_infer@.8: java.lang.UnsatisfiedLinkError: no cudnn_ops_infer in java.library.path
    Debug: Loading library cudnn_ops_train
    Debug: Failed to load for cudnn_ops_train@.8: java.lang.UnsatisfiedLinkError: no cudnn_ops_train in java.library.path
    Debug: Loading library cudnn_adv_infer
    Debug: Failed to load for cudnn_adv_infer@.8: java.lang.UnsatisfiedLinkError: no cudnn_adv_infer in java.library.path
    Debug: Loading library cudnn_adv_train
    Debug: Failed to load for cudnn_adv_train@.8: java.lang.UnsatisfiedLinkError: no cudnn_adv_train in java.library.path
    Debug: Loading library cudnn_cnn_infer
    Debug: Failed to load for cudnn_cnn_infer@.8: java.lang.UnsatisfiedLinkError: no cudnn_cnn_infer in java.library.path
    Debug: Loading library cudnn_cnn_train
    Debug: Failed to load for cudnn_cnn_train@.8: java.lang.UnsatisfiedLinkError: no cudnn_cnn_train in java.library.path
    Debug: Loading /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libopencv_cudev.so.409
    Debug: Loading /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libopencv_core.so.409
    Debug: Loading /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libopencv_imgproc.so.409
    Debug: Loading /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libjniopencv_core.so
    Debug: Loading class org.bytedeco.javacpp.presets.javacpp
    Debug: Loading class org.bytedeco.openblas.global.openblas_nolapack
    Debug: Loading class org.bytedeco.openblas.global.openblas
    Debug: Loading class org.bytedeco.opencv.global.opencv_core
    Debug: Loading class org.bytedeco.opencv.opencv_core.CvSlice
    Debug: Registering org.bytedeco.opencv.opencv_core.CvSlice[address=0x711e38b24eb0,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x711e38b24eb0,deallocatorAddress=0x711e10329d60]]
    Debug: Registering org.bytedeco.opencv.opencv_core.CvSlice[address=0x711e38b3d000,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x711e38b3d000,deallocatorAddress=0x711e10329d60]]

Calling opencv_core.useOpenCL()...

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x0000000000000000, pid=20862, tid=0x0000711e411df640
    #
    # JRE version: OpenJDK Runtime Environment (8.0_402-b06) (build 1.8.0_402-8u402-ga-2ubuntu1~22.04-b06)
    # Java VM: OpenJDK 64-Bit Server VM (25.402-b06 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C  0x0000000000000000

Many thanks in advance!

saudet commented 1 month ago

Please try to set the "org.bytedeco.javacpp.nopointergc" system property to "true".
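
For anyone following along: the property can be passed on the command line as -Dorg.bytedeco.javacpp.nopointergc=true, or set in code. Below is a minimal sketch of the second option. It assumes the properties have to be set before the first JavaCPP class is touched, so the call goes at the very top of main. JavaCppProps and apply() are made-up names for illustration, not part of JavaCPP.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of JavaCPP): sets the two system
// properties mentioned in this thread and returns what it set.
final class JavaCppProps {
    static Map<String, String> apply() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("org.bytedeco.javacpp.logger.debug", "true");
        props.put("org.bytedeco.javacpp.nopointergc", "true");
        // Copy every entry into the JVM-wide system properties.
        props.forEach(System::setProperty);
        return props;
    }

    public static void main(String[] args) {
        // Assumption: this must run before any org.bytedeco class is loaded.
        Map<String, String> applied = apply();
        System.out.println("Set: " + applied.keySet());
        // ... only after this point call into opencv_core and friends.
    }
}
```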

gareth-edwards commented 1 month ago

Thanks for the suggestion! I compared the debug logs between both pcs, with and without the nopointergc property. The logs are identical on both pcs, and setting the "org.bytedeco.javacpp.nopointergc" system property to "true" omits the following from the log:

    Debug: Registering org.bytedeco.opencv.opencv_core.CvSlice[address=0x7f20a8b8e600,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x7f20a8b8e600,deallocatorAddress=0x7f208033ed60]]
    Debug: Registering org.bytedeco.opencv.opencv_core.CvSlice[address=0x7f20a8b08240,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x7f20a8b08240,deallocatorAddress=0x7f208033ed60]]

However, the new pc still produces "A fatal error has been detected by the Java Runtime Environment" with the "org.bytedeco.javacpp.nopointergc" system property set to "true" (it's still ok on the old pc regardless of the property).

I was hoping the log comparison would show me something different to look into, but since they are the same I am completely stumped!

The only differences I can think of: the newer pc (with the fatal error) is an AMD machine running Ubuntu 22.04 with the nvidia 340.108 display driver for a GeForce 210 (GT218); the old pc (where it works) is an Intel machine running Ubuntu 20.04 with the intel i915 display driver on Intel integrated graphics (really slow!).

Is there anything else I can try?

Thanks again very much in advance!

gareth-edwards commented 1 month ago

I have tried:

javacv-platform: 1.5.9, 1.5.10, 1.5.11-SNAPSHOT
opencv-platform: 4.7.0-1.5.9, 4.9.0-1.5.10, 4.9.0-1.5.11-SNAPSHOT

I wish I knew how to look into where the problem may be. I can see that opencv_core.useOpenCL() is declared as the native method @Namespace("cv::ocl") public static native @Cast("bool") boolean useOpenCL(); and calling it produces the error. That's as far as I can go!

Are there any dependencies that need to be installed on a new pc? https://github.com/bytedeco/javacv?tab=readme-ov-file#required-software suggests not, but I did try installing opencl (no improvement).

Thanks, and all the best.

gareth-edwards commented 1 month ago

I have tried:

Different jdks: open-jdk-8 and org.eclipse.justj.openjdk.hotspot.jre.full.linux.x86_64_17.0.10.v20240120-1143/jre
Compiling the code on the old, working pc, copying it to the new, erroring pc, and running it there
Installing opencl, opencv, ffmpeg

Still the fatal error / segfault.

I also tried the demo code at https://github.com/bytedeco/javacv?tab=readme-ov-file#required-software : Smoother works, but the Demo class produces the segfault on classifier.detectMultiScale(grayImage, faces); (line 94).

The hs_err_pid.log is very long. I don't know which parts are most relevant, but here are some entries that may point to the cause of the failure in the Demo class:

    Register to memory mapping:

    RDX=0x0000768448761b10: clRetainDevice_pfn+0 in /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libopencv_core.so.409 at 0x0000768447c00000
    R8 =0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000
    R9 =0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000
    R10=0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000

    Stack: [0x00007685fa500000,0x00007685fa600000], sp=0x00007685fa5fc928, free space=1010k
    Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
    j org.bytedeco.opencv.opencv_objdetect.CascadeClassifier.detectMultiScale(Lorg/bytedeco/opencv/opencv_core/Mat;Lorg/bytedeco/opencv/opencv_core/RectVector;)V+0
    j Demo.main([Ljava/lang/String;)V+432
    v ~StubRoutines::call_stub

    Internal exceptions (10 events):
    Event: 3.504 Thread 0x00007685f4015000 Exception <a 'java/lang/reflect/InvocationTargetException'> (0x00000000d7f69a08) thrown at [./src/hotspot/src/share/vm/runtime/reflection.cpp, line 1090]
    Event: 4.118 Thread 0x00007685f4015000 Exception <a 'java/io/FileNotFoundException'> (0x00000000dcc83138) thrown at [./src/hotspot/src/share/vm/prims/jni.cpp, line 709]
    Event: 4.140 Thread 0x00007685f4015000 Exception <a 'java/lang/ClassNotFoundException': sun/dc/DuctusRenderingEngine> (0x00000000dccdd1e8) thrown at [./src/hotspot/src/share/vm/classfile/systemDictionary.cpp, line 217]
    Event: 4.197 Thread 0x00007685f4015000 Implicit null exception at 0x00007685e5399036 to 0x00007685e5399521
    Event: 4.197 Thread 0x00007685f4015000 Implicit null exception at 0x00007685e5389774 to 0x00007685e5389c29
    Event: 4.200 Thread 0x00007685f4015000 Implicit null exception at 0x00007685e580abf4 to 0x00007685e580ae75
    Event: 4.200 Thread 0x00007685f4015000 Implicit null exception at 0x00007685e56562da to 0x00007685e565683d
    Event: 4.201 Thread 0x00007685f4015000 Implicit null exception at 0x00007685e5378f2f to 0x00007685e5378f9d
    Event: 4.248 Thread 0x00007685f614f800 Implicit null exception at 0x00007685e57be7ec to 0x00007685e57befb9
    Event: 4.413 Thread 0x00007685f614f800 Exception <a 'java/lang/UnsupportedOperationException': > (0x00000000dd7befe8) thrown at [./src/hotspot/src/share/vm/prims/jni.cpp, line 736]
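
Since the hs_err log is long, one way to narrow it down is to pull out just the native libraries it names. The sketch below is a rough helper for that. It assumes library paths appear as "in /path/to/lib.so" or "in /path/to/lib.so.N", the form seen in the register mapping entries above. HsErrLibs and nativeLibraries are made-up names, and the regex is a guess at the format rather than a full hs_err parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the JDK or JavaCPP).
final class HsErrLibs {
    // Matches "in <absolute path ending in .so or .so.<digits>...>".
    private static final Pattern LIB =
            Pattern.compile("\\bin (/\\S+\\.so(?:\\.\\d+)*)(?=\\s|$)");

    // Returns each distinct library path, in order of first appearance.
    static Set<String> nativeLibraries(List<String> lines) {
        Set<String> libs = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = LIB.matcher(line);
            while (m.find()) {
                libs.add(m.group(1));
            }
        }
        return libs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HsErrLibs <path to hs_err log>");
            return;
        }
        for (String lib : nativeLibraries(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(lib);
        }
    }
}
```

Run against the entries above, this would list the bundled libopencv_core.so.409 and the system /lib/x86_64-linux-gnu/libOpenCL.so, which are the two libraries involved at the point of the crash.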

saudet commented 1 month ago

    RDX=0x0000768448761b10: clRetainDevice_pfn+0 in /home/gareth/.javacpp/cache/opencv-4.9.0-1.5.10-linux-x86_64-gpu.jar/org/bytedeco/opencv/linux-x86_64-gpu/libopencv_core.so.409 at 0x0000768447c00000
    R8 =0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000
    R9 =0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000
    R10=0x0000768440000eb8: <offset 0xeb8> in /lib/x86_64-linux-gnu/libOpenCL.so at 0x0000768440000000

Sounds to me like that OpenCL library (/lib/x86_64-linux-gnu/libOpenCL.so) is bad. Please try another one.

gareth-edwards commented 1 month ago

Many thanks for the suggestion!

I was able to get the code running by switching the display driver from nvidia back to nouveau. This disabled opencl, so it works like that, but is very slow...

Well, that led me to explore where the OpenCL library was coming from and how to try different ones. I had installed the opencl-dev package from the main Ubuntu repos, but when I looked into (re-installing?) it with "apt install opencl-dev", I saw:

    Note, selecting 'ocl-icd-opencl-dev' instead of 'opencl-dev'
    ocl-icd-opencl-dev is already the newest version (2.2.14-3).

Then "apt-cache policy ocl-icd-opencl-dev" showed me that ocl-icd-opencl-dev was also from the main Ubuntu repos.

Then "apt search opencl" showed me that nvidia-opencl-dev was available, so I installed that, but it's also from the main Ubuntu repos. Having switched back to the nvidia driver to try it, it segfaults again with the same log entry about /lib/x86_64-linux-gnu/libOpenCL.so.

I'll keep trying to see how to get it to use a different OpenCL library. From what I read it seems this comes from the graphics card driver, but I'm not 100% sure. The dates on the /lib/x86_64-linux-gnu/libOpenCL.so files are September 2021; I don't really know what to make of that.
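
One way to check which OpenCL library file a running JVM has actually mapped, on Linux only, is to look at /proc/self/maps from inside the process. The sketch below does that. It assumes the library has already been loaded by the time it runs, for example after a call such as opencv_core.haveOpenCL(), which is a guess about when OpenCV loads it. LoadedLibs and mappedFiles are made-up names for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diagnostic helper (not part of JavaCV or JavaCPP).
final class LoadedLibs {
    // Each maps line reads: address perms offset dev inode [pathname].
    // Returns the distinct pathnames that contain the given substring.
    static Set<String> mappedFiles(List<String> mapsLines, String needle) {
        Set<String> paths = new TreeSet<>();
        for (String line : mapsLines) {
            String[] fields = line.trim().split("\\s+", 6);
            if (fields.length == 6 && fields[5].contains(needle)) {
                paths.add(fields[5]);
            }
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // Linux only: /proc/self/maps describes the current process.
        List<String> maps = Files.readAllLines(Paths.get("/proc/self/maps"));
        for (String path : mappedFiles(maps, "libOpenCL")) {
            System.out.println(path);
        }
    }
}
```

Run on its own, main prints nothing because a plain JVM has not loaded OpenCL. The mappedFiles call is meant to be dropped into the failing program after the OpenCV calls.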

The graphics card is also very old (GeForce 210 GT218) and out of support so the driver came from https://ppa.launchpadcontent.net/kelebek333/nvidia-legacy/ubuntu

I'll see if there are any different drivers or ways to change OpenCL, and then try a newer graphics card.

gareth-edwards commented 1 month ago

Bingo!

So I looked into the packages in the repo for the legacy nvidia driver: https://launchpad.net/~kelebek333/+archive/ubuntu/nvidia-legacy/+packages

There I noticed nvidia-libopencl1-340, described as "NVIDIA OpenCL Driver and ICD Loader library". When I began to install it with apt, the output listed, under "The following packages were automatically installed and are no longer required:", a long list of packages including opencl-c-headers and opencl-clhpp-headers; under "The following packages will be REMOVED", nvidia-opencl-dev, ocl-icd-libopencl1 and ocl-icd-opencl-dev; and under "The following NEW packages will be installed", nvidia-libopencl1-340.
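
Some background that may help others here: on Linux, an OpenCL ICD loader (libOpenCL.so) usually finds the vendor drivers through small *.icd text files, conventionally under /etc/OpenCL/vendors, each naming a driver library. The sketch below lists those files and what they contain. The directory is an assumption about a typical install and was not verified on the machines in this thread. IcdVendors and readIcdFiles are made-up names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper (not part of JavaCV or JavaCPP).
final class IcdVendors {
    // Maps each *.icd file name to its trimmed content, which is
    // normally the name or path of a vendor OpenCL driver library.
    static Map<String, String> readIcdFiles(Path vendorsDir) throws IOException {
        Map<String, String> result = new TreeMap<>();
        if (!Files.isDirectory(vendorsDir)) {
            return result; // no ICD registry at this location
        }
        try (DirectoryStream<Path> icds = Files.newDirectoryStream(vendorsDir, "*.icd")) {
            for (Path icd : icds) {
                String content = new String(Files.readAllBytes(icd), StandardCharsets.UTF_8).trim();
                result.put(icd.getFileName().toString(), content);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/etc/OpenCL/vendors");
        readIcdFiles(dir).forEach((file, lib) -> System.out.println(file + " -> " + lib));
    }
}
```

An empty result would suggest that the loader has no vendor driver registered at that location.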

So I thought that was very promising and went ahead with it.

The good news is: it does not segfault any more! The bad news is: it's still slow :( opencv_core.haveOpenCL() returns true, but even after opencv_core.setUseOpenCL(true) and System.load("/usr/lib/x86_64-linux-gnu/libOpenCL.so.1"), opencv_core.useOpenCL() returns false. I'll look into this next.

I believe I have learnt more about OpenCL, and to install the OpenCL library that ships with whatever new driver I use.

Many thanks for that, and such a great library!

gareth-edwards commented 1 month ago

Worked immediately with a newer, in-support, graphics card with the Nvidia driver from the standard Ubuntu repos. Even had opencl support! (About 25% faster)