doe300 / VC4CL

OpenCL implementation running on the VideoCore IV GPU of the Raspberry Pi models
MIT License

Error building VC4CL #105

Open TG9541 opened 2 years ago

TG9541 commented 2 years ago

Hi,

first of all thanks for this great project!

I'm currently trying to build VC4CL on a Raspberry Pi Zero following this description using the most recent Raspbian version (see below) after a normal `apt update && apt upgrade`. Memory for linking VC4C was a bit low initially, but making the swap file bigger helped.

Unfortunately I'm getting the error below in a late phase of building VC4CL - I've double checked library installation etc. but to no avail.

Any help would be greatly appreciated!

Kind regards, Thomas

[ 82%] Linking CXX shared library libVC4CL.so
[ 82%] Built target VC4CL
[ 85%] Building CXX object tools/CMakeFiles/vc4cl_dump_analyzer.dir/DumpAnalyzer.cpp.o
[ 88%] Linking CXX executable vc4cl_dump_analyzer
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `bcm_host_get_processor_id'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_malloc_cache'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_exit'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `bcm_host_get_model_type'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_gencmd'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_gencmd_stop'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_vchi_gencmd_init'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vchi_disconnect'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_lock'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_gpuserv_execute_code'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `bcm_host_deinit'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_unlock_ptr'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_free'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `bcm_host_init'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_clean_invalid2'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vchi_initialise'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vchi_connect'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `bcm_host_get_peripheral_address'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_vc_addr_from_hdl'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_gpuserv_deinit'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vc_gpuserv_init'
/usr/bin/ld: ../src/libVC4CL.so.0.4.9999: undefined reference to `vcsm_init_ex'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/vc4cl_dump_analyzer.dir/build.make:105: tools/vc4cl_dump_analyzer] Error 1
make[1]: *** [CMakeFiles/Makefile2:278: tools/CMakeFiles/vc4cl_dump_analyzer.dir/all] Error 2
make: *** [Makefile:171: all] Error 2

Raspbian version:

thomas@zero:~/opencl/VC4CL/build $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 11 (bullseye)"
NAME="Raspbian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
doe300 commented 2 years ago

What are the contents of your /opt/vc/lib directory? If you don't have libbcm_host.so and libvcsm.so in there, you might be missing some Raspberry Pi firmware package.
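
A minimal sketch of that check, assuming a POSIX shell. `find_vc_lib` is a hypothetical helper, not part of VC4CL, and the directories in the example call are simply the ones that come up in this thread:

```shell
#!/bin/sh
# Sketch: print the first candidate directory that holds a given library.
# find_vc_lib is a hypothetical helper name, not a VC4CL tool.
find_vc_lib() {
    # $1: library file name, e.g. libvcsm.so
    # remaining arguments: directories to search, in order
    lib="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
            return 0
        fi
    done
    return 1
}

# Example call (directories taken from this discussion):
# find_vc_lib libvcsm.so /opt/vc/lib /usr/lib/arm-linux-gnueabihf
```

If the function returns non-zero for both libbcm_host.so and libvcsm.so, the firmware libraries are probably not installed in any of the listed directories.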

TG9541 commented 2 years ago

You're right - libvcsm.so is missing, and in fact /opt/ is empty! I had believed that libraspberrypi-dev contains all the necessary dependencies, and I double-checked that it is installed. Do you have a hint which package I might have missed?

Edit: it's in /usr/lib/arm-linux-gnueabihf/libvcsm.so - I'll try working with a symlink

TG9541 commented 2 years ago

Ok, both files had been in /usr/lib/arm-linux-gnueabihf/.

I've downloaded https://github.com/raspberrypi/firmware and created a symlink. I'm still getting the same error.

The contents of `/opt/vc/lib/` are now the following:

thomas@zero:~/opencl/VC4CL/build $ ls /opt/vc/lib
libbcm_host.so    libcontainers.so       libEGL_static.a     libkhrn_client.a       libmmal_util.so       libvchostif.a  pkgconfig
libbrcmEGL.so     libdebug_sym.so        libelftoolchain.so  libkhrn_static.a       libmmal_vc_client.so  libvcilcs.a    plugins
libbrcmGLESv2.so  libdebug_sym_static.a  libGLESv1_CM.so     libmmal_components.so  libopenmaxil.so       libvcos.so
libbrcmOpenVG.so  libdtovl.so            libGLESv2.so        libmmal_core.so        libOpenVG.so          libvcsm.so
libbrcmWFC.so     libEGL.so              libGLESv2_static.a  libmmal.so             libvchiq_arm.so       libWFC.so
TG9541 commented 2 years ago

Note: https://github.com/doe300/VC4CL/issues/69 references https://github.com/doe300/VC4CL/issues/53

I'll try what @Tritbool proposed, "install userland from source": https://github.com/doe300/VC4CL/issues/53#issuecomment-507254718

doe300 commented 2 years ago

Are you working on a clean (new) installation of Debian Bullseye? If so, maybe the libraries were moved recently; I will have to check the Raspberry Pi packages... In the meantime, you could modify lines 30 and 31 of https://github.com/doe300/VC4CL/blob/master/src/CMakeLists.txt to look for the libraries in /usr/lib/arm-linux-gnueabihf/ instead.

TG9541 commented 2 years ago

The Bullseye installation was reasonably fresh and clean (I installed a few standard packages unrelated to machine level development). The problem persisted after building "userland" from source - I decided to start over with "make clean" of VC4CLStdLib, VC4C and VC4CL. That takes a few hours and the machine is still building.

TG9541 commented 2 years ago

This time it worked :-)

The following should result in a working build:

  1. build and install userland
  2. follow the instructions here

To test the workaround that you proposed, I've changed CMakeLists.txt so that it points to /usr/lib/arm-linux-gnueabihf - I'll let you know the result.

For future reference:

I used the following versions:

Here is the result of clinfo from my Pi Zero:

Number of platforms                               1
  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
  Platform Vendor                                 doe300
  Platform Version                                OpenCL 1.2 VC4CL 0.4.9999 (1acb1b8)
  Platform Profile                                EMBEDDED_PROFILE
  Platform Extensions                             cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters
  Platform Extensions function suffix             VC4CL

  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices                                 1
  Device Name                                     VideoCore IV GPU
  Device Vendor                                   Broadcom
  Device Vendor ID                                0x14e4
  Device Version                                  OpenCL 1.2 VC4CL 0.4.9999 (1acb1b8)
  Device Numeric Version                          0x402000 (1.2.0)
  Driver Version                                  0.4.9999
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Profile                                  EMBEDDED_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Available core IDs                              0
  Max clock frequency                             300MHz
  Core Temperature (Altera)                       36 C
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             12x12x12
  Max work group size                             12
  Preferred work group size multiple (kernel)     1
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                               16 / 16      
    int                                                 16 / 16      
    long                                                 0 / 0       
    half                                                 0 / 0        (n/a)
    float                                               16 / 16      
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 Yes
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    32, Little-Endian
  Global memory size                              268435456 (256MiB)
  Error Correction support                        No
  Max memory allocation                           134217728 (128MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             64 bytes
  Alignment of base address                       512 bits (64 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768 (32KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   No
  Local memory type                               Global
  Local memory size                               268435456 (256MiB)
  Max number of constant args                     32
  Max constant buffer size                        268435456 (256MiB)
  Max size of kernel argument                     256
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    IL version                                    SPIR-V_1.5 SPIR_1.2
    ILs with version                              SPIR                                                             0x402000 (1.2.0)
                                                  SPIR-V                                                           0x405000 (1.5.0)
    SPIR versions                                 1.2
  printf() buffer size                            0
  Built-in kernels                                (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_nv_pragma_unroll cl_arm_core_id cl_ext_atomic_counters_32 cl_khr_initialize_memory cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16 cl_arm_integer_dot_product_accumulate_saturate_int8 cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters
  Device Extensions with Version                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                                     0 (0.0.0)
                                                  cl_arm_core_id                                                   0x800000 (2.0.0)
                                                  cl_ext_atomic_counters_32                                        0x1400000 (5.0.0)
                                                  cl_khr_initialize_memory                                         0x400000 (1.0.0)
                                                  cl_arm_integer_dot_product_int8                                  0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_int8                       0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_int16                      0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_saturate_int8              0xc00000 (3.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spir                                                      0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_altera_device_temperature                                            0 (0.0.0)
                                                  cl_altera_live_object_tracking                                          0 (0.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                                 0 (0.0.0)
                                                  cl_khr_suggested_local_work_size                                 0x400000 (1.0.0)
                                                  cl_vc4cl_performance_counters                                           0 (0.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  OpenCL for the Raspberry Pi VideoCore IV GPU
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [VC4CL]
  clCreateContext(NULL, ...) [default]            Success [VC4CL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.14
  ICD loader Profile                              OpenCL 3.0
TG9541 commented 2 years ago

After changing src/CMakeLists.txt building VC4CL worked.

I haven't tested, though, if the problem comes from VC4CLStdLib or VC4C.

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 99ed99e..b1e6942 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -27,8 +27,10 @@ endif()
 if(MOCK_HAL)
        target_compile_definitions(VC4CL PRIVATE MOCK_HAL=1)
 elseif(CROSS_COMPILE OR EXISTS "/opt/vc/include/bcm_host.h")
-       find_library(BCMHOST_LIBRARY NAMES bcm_host libbcm_host HINTS "/opt/vc/lib")
-       find_library(VCSM_LIBRARY NAMES vcsm libvcsm HINTS "/opt/vc/lib")
+       # find_library(BCMHOST_LIBRARY NAMES bcm_host libbcm_host HINTS "/opt/vc/lib")
+       # find_library(VCSM_LIBRARY NAMES vcsm libvcsm HINTS "/opt/vc/lib")
+       find_library(BCMHOST_LIBRARY NAMES bcm_host libbcm_host HINTS "usr/lib/arm-linux-gnueabihf")
+       find_library(VCSM_LIBRARY NAMES vcsm libvcsm HINTS "usr/lib/arm-linux-gnueabihf")
        target_link_libraries(VC4CL ${BCMHOST_LIBRARY} ${VCSM_LIBRARY} ${SYSROOT_LIBRARY_FLAGS})
 endif()
 if(ENABLE_COVERAGE)
doe300 commented 2 years ago

Thanks for the detailed analysis.

It looks like the libraries were moved, at least for some Ubuntu ports. I will try to find out whether this also happened for official builds and update the CMake files.

PaxJaromeMalues commented 2 years ago

Apparently we have some more moved libraries:

Scanning dependencies of target VC4C
[ 56%] Building CXX object src/CMakeFiles/VC4C.dir/main.cpp.o
[ 56%] Linking CXX executable vc4c
/home/USER/buildarea/opencl/VC4C/build/src/vc4c: error while loading shared libraries: /home/USER/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2: unexpected reloc type 0x03
make[2]: *** [src/CMakeFiles/VC4C.dir/build.make:88: src/vc4c] Error 127
make[2]: *** Deleting file 'src/vc4c'
make[1]: *** [CMakeFiles/Makefile2:2018: src/CMakeFiles/VC4C.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

Running on an RPi 2B (1GB) with Ubuntu Server 20.04.3 LTS

doe300 commented 2 years ago

That looks strange... Does the error still occur if you do a clean rebuild of the VC4C project?

PaxJaromeMalues commented 2 years ago

As far as I tried: yes. After a reboot, make started the build from the ground up and ran until this reoccurred. I also built userland from source, which did not help. I will delete the entire git clone now and try a complete rebuild, but it is very likely to end up the same way.

PaxJaromeMalues commented 2 years ago

I already looked into the previous messages and tried to find something in the CMakeLists.txt, but it's just full of strings and variables that I have no idea about. There is nothing about finding a library .so file.

doe300 commented 2 years ago

As far as I can see it, the error message occurs when the libVC4CC.so.1.2 library is linked into the VC4C executable. But since they both have the same compilation options, I don't know why it would fail...

What compiler/linker are you using? What do ldd /home/USER/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2 and file -L /home/USER/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2 output?

PaxJaromeMalues commented 2 years ago

Sorry for the delay, I had friends over after I typed the last one.

ldd /home/USER/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2

user@ubuntu:~$ ldd /home/user/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2
        linux-vdso.so.1 (0xbeff7000)
        libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0xb6025000)
        libLLVM-10.so.1 => /lib/arm-linux-gnueabihf/libLLVM-10.so.1 (0xb2214000)
        libstdc++.so.6 => /lib/arm-linux-gnueabihf/libstdc++.so.6 (0xb20cb000)
        libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xb2062000)
        libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xb2039000)
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xb1f3b000)
        /lib/ld-linux-armhf.so.3 (0xb6fcb000)
        libffi.so.7 => /lib/arm-linux-gnueabihf/libffi.so.7 (0xb1f25000)
        libedit.so.2 => /lib/arm-linux-gnueabihf/libedit.so.2 (0xb1ef5000)
        libz.so.1 => /lib/arm-linux-gnueabihf/libz.so.1 (0xb1ed2000)
        librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0xb1ebc000)
        libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0xb1ea9000)
        libtinfo.so.6 => /lib/arm-linux-gnueabihf/libtinfo.so.6 (0xb1e7c000)
        libbsd.so.0 => /lib/arm-linux-gnueabihf/libbsd.so.0 (0xb1e59000)

file -L /home/USER/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2

user@ubuntu:~$ file -L /home/user/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2
/home/user/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=fddc10984ae4cf636b7b907db2e79da5b0a289f1, with debug_info, not stripped

FYI I tried an entire clean install from source and it also failed with the same error message as before.

Thanks a lot for taking your time with a random dude seeking help :)

PaxJaromeMalues commented 2 years ago

Do you need any other information that I can provide?

doe300 commented 2 years ago

So I just checked on my (a little bit dated) Raspberry Pi OS. Besides me using LLVM 6 instead of 10, the only major difference I can see is that my library additionally links to

/usr/lib/arm-linux-gnueabihf/libarmmem-${PLATFORM}.so
libatomic.so.1

Can you try 2 more things?

  1. What does readelf -r /home/user/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2 | grep R_ARM_REL32 output? This should basically list the symbols that have this wrong linkage type.
  2. Can you post the contents of /home/user/buildarea/opencl/VC4C/build/src/CMakeFiles/VC4CC.dir/link.txt? Does the link command contain any static library (something.a)?
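
Both checks can be condensed with two small shell helpers. This is only a sketch, assuming a POSIX shell with awk, tr, grep and sort available; `rel32_symbols` and `list_static_archives` are hypothetical names, not tools shipped with VC4C:

```shell
#!/bin/sh
# Sketch: summarize the two diagnostics requested above.

# Reads `readelf -r <library>` output on stdin and prints each symbol
# that carries an R_ARM_REL32 relocation exactly once.
rel32_symbols() {
    awk '$3 == "R_ARM_REL32" { print $5 }' | LC_ALL=C sort -u
}

# $1: file holding a link command (e.g. CMake's link.txt).
# Prints every argument that ends in .a, i.e. the static archives.
list_static_archives() {
    tr -s ' \t' '\n\n' < "$1" | grep '\.a$'
}

# Example calls (paths as used in this thread):
# readelf -r build/src/libVC4CC.so.1.2 | rel32_symbols
# list_static_archives build/src/CMakeFiles/VC4CC.dir/link.txt
```

The first helper shortens a long relocation listing to the handful of distinct symbols involved; the second answers the static-library question without reading the whole link command by eye.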
PaxJaromeMalues commented 2 years ago

1.

user@ubuntu:~$ readelf -r /home/user/buildarea/opencl/VC4C/build/src/libVC4CC.so.1.2 | grep R_ARM_REL32
00e4c990  0055ad03 R_ARM_REL32       00f45f18   _ZTV7Message
00e4d024  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbdc  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbe0  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbe4  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbe8  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbec  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbf0  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbf4  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbf8  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dbfc  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc00  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc04  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc08  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc0c  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc10  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc14  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc18  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc1c  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4dc20  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e094  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e098  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e09c  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e0a0  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e0a4  0099ca03 R_ARM_REL32       00f46060   _ZN9Validator3MSGE
00e4e9fc  0083f903 R_ARM_REL32       00f48eac   _ZTVN9Validator7Messag
00e4eb3c  0083f903 R_ARM_REL32       00f48eac   _ZTVN9Validator7Messag
00e518d4  00560a03 R_ARM_REL32       00f7fac8   _Z7exepathB5cxx11
00e52424  00560a03 R_ARM_REL32       00f7fac8   _Z7exepathB5cxx11
00e52430  00560a03 R_ARM_REL32       00f7fac8   _Z7exepathB5cxx11

2.

user@ubuntu:~$ cat /home/user/buildarea/opencl/VC4C/build/src/CMakeFiles/VC4CC.dir/link.txt
/usr/bin/c++ -fPIC -g3 -rdynamic  -shared -Wl,-soname,libVC4CC.so.1.2 -o libVC4CC.so.0.4.9999 CMakeFiles/VC4CC.dir/BasicBlock.cpp.o CMakeFiles/VC4CC.dir/CompilationError.cpp.o CMakeFiles/VC4CC.dir/Compiler.cpp.o CMakeFiles/VC4CC.dir/Disassembler.cpp.o CMakeFiles/VC4CC.dir/Expression.cpp.o CMakeFiles/VC4CC.dir/GlobalValues.cpp.o CMakeFiles/VC4CC.dir/HalfType.cpp.o CMakeFiles/VC4CC.dir/InstructionWalker.cpp.o CMakeFiles/VC4CC.dir/Locals.cpp.o CMakeFiles/VC4CC.dir/Method.cpp.o CMakeFiles/VC4CC.dir/Module.cpp.o CMakeFiles/VC4CC.dir/ProcessUtil.cpp.o CMakeFiles/VC4CC.dir/Profiler.cpp.o CMakeFiles/VC4CC.dir/Register.cpp.o CMakeFiles/VC4CC.dir/signals.cpp.o CMakeFiles/VC4CC.dir/SIMDVector.cpp.o CMakeFiles/VC4CC.dir/ThreadPool.cpp.o CMakeFiles/VC4CC.dir/Types.cpp.o CMakeFiles/VC4CC.dir/Values.cpp.o CMakeFiles/VC4CC.dir/shared/BinaryHeader.cpp.o CMakeFiles/VC4CC.dir/analysis/AvailableExpressionAnalysis.cpp.o CMakeFiles/VC4CC.dir/analysis/ControlFlowGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/ControlFlowLoop.cpp.o CMakeFiles/VC4CC.dir/analysis/DataDependencyGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/DebugGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/DependencyGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/DominatorTree.cpp.o CMakeFiles/VC4CC.dir/analysis/FlagsAnalysis.cpp.o CMakeFiles/VC4CC.dir/analysis/InterferenceGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/LifetimeGraph.cpp.o CMakeFiles/VC4CC.dir/analysis/LivenessAnalysis.cpp.o CMakeFiles/VC4CC.dir/analysis/MemoryAnalysis.cpp.o CMakeFiles/VC4CC.dir/analysis/PatternMatching.cpp.o CMakeFiles/VC4CC.dir/analysis/RegisterAnalysis.cpp.o CMakeFiles/VC4CC.dir/analysis/ValueRange.cpp.o CMakeFiles/VC4CC.dir/analysis/WorkItemAnalysis.cpp.o CMakeFiles/VC4CC.dir/asm/ALUInstruction.cpp.o CMakeFiles/VC4CC.dir/asm/BranchInstruction.cpp.o CMakeFiles/VC4CC.dir/asm/CodeGenerator.cpp.o CMakeFiles/VC4CC.dir/asm/GraphColoring.cpp.o CMakeFiles/VC4CC.dir/asm/Instruction.cpp.o CMakeFiles/VC4CC.dir/asm/KernelInfo.cpp.o 
CMakeFiles/VC4CC.dir/asm/LoadInstruction.cpp.o CMakeFiles/VC4CC.dir/asm/OpCodes.cpp.o CMakeFiles/VC4CC.dir/asm/RegisterFixes.cpp.o CMakeFiles/VC4CC.dir/asm/SemaphoreInstruction.cpp.o CMakeFiles/VC4CC.dir/intermediate/Branching.cpp.o CMakeFiles/VC4CC.dir/intermediate/Helper.cpp.o CMakeFiles/VC4CC.dir/intermediate/Instruction.cpp.o CMakeFiles/VC4CC.dir/intermediate/LoadImmediate.cpp.o CMakeFiles/VC4CC.dir/intermediate/MemoryInstruction.cpp.o CMakeFiles/VC4CC.dir/intermediate/MethodCall.cpp.o CMakeFiles/VC4CC.dir/intermediate/Operations.cpp.o CMakeFiles/VC4CC.dir/intermediate/Synchronization.cpp.o CMakeFiles/VC4CC.dir/intermediate/TypeConversions.cpp.o CMakeFiles/VC4CC.dir/intermediate/VectorHelper.cpp.o CMakeFiles/VC4CC.dir/intrinsics/Comparisons.cpp.o CMakeFiles/VC4CC.dir/intrinsics/Images.cpp.o CMakeFiles/VC4CC.dir/intrinsics/Intrinsics.cpp.o CMakeFiles/VC4CC.dir/intrinsics/Operators.cpp.o CMakeFiles/VC4CC.dir/intrinsics/WorkItems.cpp.o CMakeFiles/VC4CC.dir/llvm/BitcodeReader.cpp.o CMakeFiles/VC4CC.dir/llvm/LLVMInstruction.cpp.o CMakeFiles/VC4CC.dir/normalization/AddressCalculation.cpp.o CMakeFiles/VC4CC.dir/normalization/Inliner.cpp.o CMakeFiles/VC4CC.dir/normalization/LiteralValues.cpp.o CMakeFiles/VC4CC.dir/normalization/LongOperations.cpp.o CMakeFiles/VC4CC.dir/normalization/MemoryAccess.cpp.o CMakeFiles/VC4CC.dir/normalization/MemoryMapChecks.cpp.o CMakeFiles/VC4CC.dir/normalization/MemoryMappings.cpp.o CMakeFiles/VC4CC.dir/normalization/Normalizer.cpp.o CMakeFiles/VC4CC.dir/normalization/Rewrite.cpp.o CMakeFiles/VC4CC.dir/optimization/Combiner.cpp.o CMakeFiles/VC4CC.dir/optimization/ControlFlow.cpp.o CMakeFiles/VC4CC.dir/optimization/Eliminator.cpp.o CMakeFiles/VC4CC.dir/optimization/Flags.cpp.o CMakeFiles/VC4CC.dir/optimization/LocalCompression.cpp.o CMakeFiles/VC4CC.dir/optimization/Memory.cpp.o CMakeFiles/VC4CC.dir/optimization/Optimizer.cpp.o CMakeFiles/VC4CC.dir/optimization/Peephole.cpp.o CMakeFiles/VC4CC.dir/optimization/Reordering.cpp.o 
CMakeFiles/VC4CC.dir/optimization/InstructionScheduler.cpp.o CMakeFiles/VC4CC.dir/optimization/Vector.cpp.o CMakeFiles/VC4CC.dir/periphery/CacheEntry.cpp.o CMakeFiles/VC4CC.dir/periphery/RegisterLoweredMemory.cpp.o CMakeFiles/VC4CC.dir/periphery/SFU.cpp.o CMakeFiles/VC4CC.dir/periphery/TMU.cpp.o CMakeFiles/VC4CC.dir/periphery/VPM.cpp.o CMakeFiles/VC4CC.dir/precompilation/ClangLibrary.cpp.o CMakeFiles/VC4CC.dir/precompilation/FrontendCompiler.cpp.o CMakeFiles/VC4CC.dir/precompilation/LLVMLibrary.cpp.o CMakeFiles/VC4CC.dir/precompilation/Precompiler.cpp.o CMakeFiles/VC4CC.dir/precompilation/TemporaryFile.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVBuiltins.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVHelper.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVLexer.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVOperation.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVParserBase.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVToolsParser.cpp.o CMakeFiles/VC4CC.dir/tools/Emulator.cpp.o CMakeFiles/VC4CC.dir/tools/options.cpp.o  -Wl,-rpath,:::::::::::::: -latomic ../cpplog/src/cpplog-project-build/libcpplog-static.a -lpthread -L /usr/lib/llvm-10/lib /usr/lib/llvm-10/lib/libLLVM.so ../_deps/vc4asm-build/libvc4asm.a

Yes there are some *.a links at the end of the file:

CMakeFiles/VC4CC.dir/spirv/SPIRVParserBase.cpp.o CMakeFiles/VC4CC.dir/spirv/SPIRVToolsParser.cpp.o CMakeFiles/VC4CC.dir/tools/Emulator.cpp.o CMakeFiles/VC4CC.dir/tools/options.cpp.o  -Wl,-rpath,:::::::::::::: -latomic ../cpplog/src/cpplog-project-build/libcpplog-static.a -lpthread -L /usr/lib/llvm-10/lib /usr/lib/llvm-10/lib/libLLVM.so ../_deps/vc4asm-build/libvc4asm.a
doe300 commented 2 years ago

Not sure why this never occurred on my machines, but it looks like the problem is that the vc4asm dependency is compiled without -fPIC and therefore cannot be linked into a shared library.

There are several ways this could be circumvented:

  1. Rebuild VC4C with the CMake flag -DVERIFY_OUTPUT=OFF, this will skip including the (purely optional) vc4asm library
  2. In cmake/vc4asm.cmake disable the if-block, e.g. by replacing it with if(FALSE), this uses the else()-block which explicitly sets -fPIC for the dependency build
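
Option 1 might look roughly like the following. This is a sketch under assumptions: the build directory is assumed to sit directly below the VC4C source tree, `reconfigure_without_vc4asm` is a hypothetical wrapper name, and the `CMAKE` variable exists only so that a different cmake binary can be substituted. Only the `-DVERIFY_OUTPUT=OFF` flag comes from the description above.

```shell
#!/bin/sh
# Sketch: re-run the CMake configure step with output verification disabled.
reconfigure_without_vc4asm() {
    # $1: existing VC4C build directory (assumed to be <source>/build)
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" && "${CMAKE:-cmake}" -DVERIFY_OUTPUT=OFF .. )
}

# Example call, followed by a normal rebuild:
# reconfigure_without_vc4asm ~/buildarea/opencl/VC4C/build && make -C ~/buildarea/opencl/VC4C/build
```

Because the flag changes what gets linked into the library, a clean rebuild afterwards is probably the safer choice.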
PaxJaromeMalues commented 2 years ago

I will try number 2 and will be back in a bit :) Thanks a lot for your help!

PaxJaromeMalues commented 2 years ago

Number 2 worked! VC4CL also built correctly after this. Thanks a lot!

PaxJaromeMalues commented 2 years ago

Aww. It built properly this time, but my executable apparently fails to load the CL runtime. Welp.

doe300 commented 2 years ago

Are you loading the VC4CL library directly or via the system ICD loader? Also, are you running your application as root? VC4CL requires this to access some hardware registers...

PaxJaromeMalues commented 2 years ago

I tried multiple ways. First I let the executable detect it by itself, which resulted in roughly: "I cannot find anything, so it must be disabled." Then I defined VC4CL, which resulted in something along the lines of: "I cannot load this runtime library." In the last attempt I tried to link directly against multiple library objects in /usr/local/lib/ (all of the files that have something to do with VC4C*). That resulted in the same error as before: cannot load runtime library. It might just be that the executable itself is somehow malfunctioning, but I did rebuild it from scratch before the tests, so IDK.

doe300 commented 2 years ago

Your application probably requires the clXYZ symbols. To provide these with VC4CL, you either need to build VC4CL with the CMake flag -DBUILD_ICD=OFF and then link directly against (or LD_PRELOAD) the provided libOpenCL.so (in <VC4CL/build/path>/src/libOpenCL.so), or you need to install the system ICD package (e.g. ocl-icd-opencl-dev; this is the usual way anyway), which acts as a layer between OpenCL clients and the different implementations that are available. If VC4CL was properly installed, there will be a file /etc/OpenCL/vendors/VC4CL.icd, which the ICD loader uses to find the VC4CL library.
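
A quick way to check the ICD registration could look like the sketch below. It assumes the usual ICD file layout, where the first line holds the library path or name. The function name is made up, and the default path is the one mentioned above:

```shell
#!/bin/sh
# Sketch only: check that an ICD file points at an existing library.
# Hypothetical helper, not part of VC4CL or the ICD loader.
check_icd_file() {
    icd_file="${1:-/etc/OpenCL/vendors/VC4CL.icd}"
    if [ ! -r "$icd_file" ]; then
        echo "missing ICD file: $icd_file"
        return 1
    fi
    # Assumption: the first line of the ICD file names the library.
    lib_path=$(head -n 1 "$icd_file")
    case "$lib_path" in
        /*) ;;
        *)
            # A bare library name is looked up by the dynamic loader instead.
            echo "not an absolute path: $lib_path"
            return 0
            ;;
    esac
    if [ -e "$lib_path" ]; then
        echo "ok: $lib_path"
    else
        echo "ICD file points to a missing library: $lib_path"
        return 1
    fi
}
```

Calling `check_icd_file` without arguments inspects /etc/OpenCL/vendors/VC4CL.icd. It only verifies that the referenced file exists, not that the library can actually be loaded.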

PaxJaromeMalues commented 2 years ago

I decided to start from scratch with a clean install of Raspbian OS, following the initial steps of this issue chain. Currently userland is building in the background, clinfo was built from source, and /etc/OpenCL/vendors/VC4CL.icd points to /usr/local/lib/libVC4CL.so. Let's see where this takes me...

PaxJaromeMalues commented 2 years ago

Currently after:

  1. installing prereqs
  2. building userland
  3. building vc4cl
  4. building clinfo

I am stuck on this:

admjpjuergens@rpisrv1:~ $ sudo clinfo --offline
[E] Fri Feb 11 15:35:55 2022: Received signal: SIGSEGV
[E] Fri Feb 11 15:35:55 2022:  (1) /usr/local/lib/libVC4CC.so.1.2 : +0xb1e4d0 [0x764ef4d0]
[E] Fri Feb 11 15:35:55 2022:  (2) /lib/arm-linux-gnueabihf/libc.so.6 : __default_rt_sa_restorer+0 [0x76e15db0]
admjpjuergens@rpisrv1:~ $ sudo clinfo -l
[E] Fri Feb 11 15:36:03 2022: Received signal: SIGSEGV
[E] Fri Feb 11 15:36:03 2022:  (1) /usr/local/lib/libVC4CC.so.1.2 : +0xb1e4d0 [0x764744d0]
[E] Fri Feb 11 15:36:03 2022:  (2) /lib/arm-linux-gnueabihf/libc.so.6 : __default_rt_sa_restorer+0 [0x76d9adb0]
admjpjuergens@rpisrv1:~ $ sudo clinfo
[E] Fri Feb 11 15:36:10 2022: Received signal: SIGSEGV
[E] Fri Feb 11 15:36:10 2022:  (1) /usr/local/lib/libVC4CC.so.1.2 : +0xb1e4d0 [0x764c84d0]
[E] Fri Feb 11 15:36:10 2022:  (2) /lib/arm-linux-gnueabihf/libc.so.6 : __default_rt_sa_restorer+0 [0x76deedb0]

FYI the old clinfo simply said: Number of platforms: 0

doe300 commented 2 years ago

The stack trace is not very helpful... Can you recompile VC4C with the signals.cpp in https://github.com/doe300/VC4C/blob/master/src/sources.list#L17 commented out/removed and see what happens?

Also there should be no need to build userland yourself, these commands work fine on a new Raspberry Pi Bullseye.

PaxJaromeMalues commented 2 years ago

The stack trace is not very helpful... Can you recompile VC4C with the signals.cpp in https://github.com/doe300/VC4C/blob/master/src/sources.list#L17 commented out/removed and see what happens?

Also there should be no need to build userland yourself, these commands work fine on a new Raspberry Pi Bullseye.

Sorry, I didn't want to bother you with any of this. VC4C and VC4CL are currently being rebuilt as per your instructions.

PaxJaromeMalues commented 2 years ago
[ 91%] Linking CXX shared library libVC4CC.so
/usr/bin/ld: BFD (GNU Binutils for Raspbian) 2.35.2 internal error, aborting at ../../bfd/merge.c:939 in _bfd_merged_section_offset

/usr/bin/ld: Please report this bug.

collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/VC4CC.dir/build.make:1575: src/libVC4CC.so.0.4.9999] Error 1
make[2]: *** Deleting file 'src/libVC4CC.so.0.4.9999'
make[1]: *** [CMakeFiles/Makefile2:2065: src/CMakeFiles/VC4CC.dir/all] Error 2
make: *** [Makefile:182: all] Error 2
doe300 commented 2 years ago

Curiouser and curiouser... Did you maybe run out of memory while linking?
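
One way to check this before linking is to look at the available RAM plus free swap. The sketch below reads a file in /proc/meminfo format (values in kB). The function name is made up, and no particular threshold is implied, since how much memory the link step needs is not stated in this thread:

```shell
#!/bin/sh
# Sketch only: print MemAvailable + SwapFree in MiB.
# Hypothetical helper; the optional argument is a file in /proc/meminfo format.
available_link_memory_mib() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        /^MemAvailable:/ { mem = $2 }
        /^SwapFree:/     { swap = $2 }
        END { printf "%d\n", (mem + swap) / 1024 }
    ' "$meminfo"
}
```

Running `available_link_memory_mib` right before `make` gives a rough idea of the headroom left for the linker.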

PaxJaromeMalues commented 2 years ago

Curiouser and curiouser... Did you maybe run out of memory while linking?

The system did not report anything like that. Currently 384MB are dedicated to VRAM and ~640MB to RAM. Any idea what further information I can provide?

PaxJaromeMalues commented 2 years ago

You know what, for now I will lower VRAM to 16MB and try again, who knows.

PaxJaromeMalues commented 2 years ago

It continued past the previous error position. Either there was not enough memory/swap or me cleaning the build folder with rm -rf did it.

One question regarding your build instructions:

git clone https://github.com/doe300/VC4CLStdLib.git
git clone https://github.com/doe300/VC4CL.git
git clone https://github.com/doe300/VC4C.git
cd VC4C
mkdir build
cd build/
cmake ..
make -j2
cd ../../VC4CL
mkdir build
cd build/
cmake -DBUILD_ICD=OFF ../
make -j2

What happened to sudo make install and sudo ldconfig?

doe300 commented 2 years ago

What happened to sudo make install and sudo ldconfig?

These steps are only necessary if you want to install it system-wide (which you probably want to do if you want to use the VC4CL implementation without relinking all client applications against it). I should probably update the build instructions with that.

PaxJaromeMalues commented 2 years ago

Results:

~/buildarea/opencl/VC4CL/VC4CL/build $ LD_PRELOAD=src/libVC4CL.so clinfo
Number of platforms                               1
  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
  Platform Vendor                                 doe300
  Platform Version                                OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Platform Profile                                EMBEDDED_PROFILE
  Platform Extensions                             cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters

  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices                                 1
  Device Name                                     VideoCore IV GPU
  Device Vendor                                   Broadcom
  Device Vendor ID                                0x14e4
  Device Version                                  OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Device Numeric Version                          0x402000 (1.2.0)
  Driver Version                                  0.4.9999
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Profile                                  EMBEDDED_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Available core IDs                              0
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to initialize VCSM!
Aborted
~ $ LD_PRELOAD=libVC4CL.so clinfo
Number of platforms                               1
  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
  Platform Vendor                                 doe300
  Platform Version                                OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Platform Profile                                EMBEDDED_PROFILE
  Platform Extensions                             cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters

  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices                                 1
  Device Name                                     VideoCore IV GPU
  Device Vendor                                   Broadcom
  Device Vendor ID                                0x14e4
  Device Version                                  OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Device Numeric Version                          0x402000 (1.2.0)
  Driver Version                                  0.4.9999
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Profile                                  EMBEDDED_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Available core IDs                              0
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to initialize VCSM!
Aborted
~ $ sudo LD_PRELOAD=libVC4CL.so clinfo
Number of platforms                               1
  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
  Platform Vendor                                 doe300
  Platform Version                                OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Platform Profile                                EMBEDDED_PROFILE
  Platform Extensions                             cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters

  Platform Name                                   OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices                                 1
  Device Name                                     VideoCore IV GPU
  Device Vendor                                   Broadcom
  Device Vendor ID                                0x14e4
  Device Version                                  OpenCL 1.2 VC4CL 0.4.9999 (2bed01f)
  Device Numeric Version                          0x402000 (1.2.0)
  Driver Version                                  0.4.9999
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Profile                                  EMBEDDED_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Available core IDs                              0
  Max clock frequency                             250MHz
  Core Temperature (Altera)                       39 C
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             12x12x12
  Max work group size                             12
  Preferred work group size multiple (kernel)     1
  Preferred / native vector sizes
    char                                                16 / 16
    short                                               16 / 16
    int                                                 16 / 16
    long                                                 0 / 0
    half                                                 0 / 0        (n/a)
    float                                               16 / 16
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 Yes
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    32, Little-Endian
  Global memory size                              16777216 (16MiB)
  Error Correction support                        No
  Max memory allocation                           8388608 (8MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             64 bytes
  Alignment of base address                       512 bits (64 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768 (32KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   No
  Local memory type                               Global
  Local memory size                               16777216 (16MiB)
  Max number of constant args                     32
  Max constant buffer size                        16777216 (16MiB)
  Max size of kernel argument                     256
  Queue properties
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    IL version                                    SPIR-V_1.5 SPIR_1.2
    ILs with version                              SPIR                                                             0x402000 (1.2.0)
                                                  SPIR-V                                                           0x405000 (1.5.0)
    SPIR versions                                 1.2
  printf() buffer size                            0
  Built-in kernels                                (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_nv_pragma_unroll cl_arm_core_id cl_ext_atomic_counters_32 cl_khr_initialize_memory cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16 cl_arm_integer_dot_product_accumulate_saturate_int8 cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_extended_versioning cl_khr_spirv_no_integer_wrap_decoration cl_khr_suggested_local_work_size cl_vc4cl_performance_counters
  Device Extensions with Version                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                                     0 (0.0.0)
                                                  cl_arm_core_id                                                   0x800000 (2.0.0)
                                                  cl_ext_atomic_counters_32                                        0x1400000 (5.0.0)
                                                  cl_khr_initialize_memory                                         0x400000 (1.0.0)
                                                  cl_arm_integer_dot_product_int8                                  0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_int8                       0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_int16                      0xc00000 (3.0.0)
                                                  cl_arm_integer_dot_product_accumulate_saturate_int8              0xc00000 (3.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_khr_spir                                                      0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_altera_device_temperature                                            0 (0.0.0)
                                                  cl_altera_live_object_tracking                                          0 (0.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)
                                                  cl_khr_spirv_no_integer_wrap_decoration                                 0 (0.0.0)
                                                  cl_khr_suggested_local_work_size                                 0x400000 (1.0.0)
                                                  cl_vc4cl_performance_counters                                           0 (0.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  OpenCL for the Raspberry Pi VideoCore IV GPU
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [P0]
  clCreateContext(NULL, ...) [default]            Success [P0]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 OpenCL for the Raspberry Pi VideoCore IV GPU
    Device Name                                   VideoCore IV GPU
PaxJaromeMalues commented 2 years ago

If I run clinfo without sudo and LD_PRELOAD I only get a segmentation fault.

doe300 commented 2 years ago

sudo is required because there is no MMU, so the GPU can access all memory, which is bad for security; see here.

Whether you need the LD_PRELOAD depends on how you build VC4CL:

Basically usage with ICD loader and direct usage are mutually exclusive. Whichever you need depends on your use-case. If you want to install VC4CL system-wide you probably want to build VC4CL with -DBUILD_ICD=ON and use the ICD loader.
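
The two modes could be summarized with a sketch like the following. The function name is made up, and the preload path simply mirrors the src/libVC4CL.so location used in the results above:

```shell
#!/bin/sh
# Sketch only: print the clinfo invocation matching how VC4CL was built.
# $1: the BUILD_ICD value used at configure time (ON or OFF)
# $2: the VC4CL build directory (only used for OFF)
# Hypothetical helper, not part of VC4CL.
clinfo_command() {
    case "$1" in
        ON)  echo "sudo clinfo" ;;
        OFF) echo "sudo LD_PRELOAD=$2/src/libVC4CL.so clinfo" ;;
        *)   echo "unknown BUILD_ICD value: $1" >&2; return 1 ;;
    esac
}
```

For example, `clinfo_command OFF "$PWD"` run from the VC4CL build directory prints the preload variant, while `clinfo_command ON` prints the plain call that relies on the ICD loader.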

PaxJaromeMalues commented 2 years ago

You will definitely not like this (and I am starting to feel really sorry about coming back here), but on a clean Debian Bullseye Raspberry Pi OS, following your wiki build instructions, cmake fails on the first step for VC4C. Here is the output (please do not kill me):

user@rpisrv1:~/buildarea/opencl/vc4cl/VC4C/build $ cmake ..
CMake Deprecation Warning at CMakeLists.txt:4 (cmake_policy):
  The OLD behavior for policy CMP0026 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.

-- VC4CL standard library headers found: /home/user/buildarea/opencl/vc4cl/VC4C/../VC4CLStdLib/include/
-- The C compiler identification is Clang 11.0.1
-- The CXX compiler identification is Clang 11.0.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc - broken
CMake Error at /usr/share/cmake-3.18/Modules/CMakeTestCCompiler.cmake:66 (message):
  The C compiler

    "/usr/bin/cc"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeTmp

    Run Build Command(s):/usr/bin/gmake cmTC_7cf1b/fast && /usr/bin/gmake  -f CMakeFiles/cmTC_7cf1b.dir/build.make CMakeFiles/cmTC_7cf1b.dir/build
    gmake[1]: Entering directory '/home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeTmp'
    Building C object CMakeFiles/cmTC_7cf1b.dir/testCCompiler.c.o
    /usr/bin/cc    -o CMakeFiles/cmTC_7cf1b.dir/testCCompiler.c.o -c /home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeTmp/testCCompiler.c
    Linking C executable cmTC_7cf1b
    /usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_7cf1b.dir/link.txt --verbose=1
    /usr/bin/cc -rdynamic CMakeFiles/cmTC_7cf1b.dir/testCCompiler.c.o -o cmTC_7cf1b
    clang: error: unable to execute command: Executable "ld" doesn't exist!
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    gmake[1]: *** [CMakeFiles/cmTC_7cf1b.dir/build.make:106: cmTC_7cf1b] Error 1
    gmake[1]: Leaving directory '/home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeTmp'
    gmake: *** [Makefile:140: cmTC_7cf1b/fast] Error 2

  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:52 (project)

-- Configuring incomplete, errors occurred!
See also "/home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeOutput.log".
See also "/home/user/buildarea/opencl/vc4cl/VC4C/build/CMakeFiles/CMakeError.log".
PaxJaromeMalues commented 2 years ago

FYI:

lrwxrwxrwx 1 root root 20 Feb 19 07:53 /usr/bin/cc -> /etc/alternatives/cc
lrwxrwxrwx 1 root root 14 Feb 19 07:53 /etc/alternatives/cc -> /usr/bin/clang
lrwxrwxrwx 1 root root 24 May 12 2021 /usr/bin/clang -> ../lib/llvm-11/bin/clang

So the packages seem to be installed properly. Clean Debian OS ships with gcc-10. So that should be available too.

doe300 commented 2 years ago
clang: error: unable to execute command: Executable "ld" doesn't exist!

That seems to be the underlying error. Did you install the requirements listed in here? Since they should provide the ld linker program (via a dependency of the build-essential package).

PaxJaromeMalues commented 2 years ago
clang: error: unable to execute command: Executable "ld" doesn't exist!

That seems to be the underlying error. Did you install the requirements listed in here? Since they should provide the ld linker program (via a dependency of the build-essential package).

Yes I did. Copied the instructions straight from the wiki to make sure I did not fuck up.

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
cmake is already the newest version (3.18.4-2+rpi1+deb11u1).
git is already the newest version (1:2.30.2-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Package: build-essential
Version: 12.9
Priority: optional
Section: devel
Maintainer: Matthias Klose <doko@debian.org>
Installed-Size: 20.5 kB
Depends: libc6-dev | libc-dev, gcc (>= 4:10.2), g++ (>= 4:10.2), make, dpkg-dev (>= 1.17.11)
Download-Size: 7,704 B
APT-Manual-Installed: yes
APT-Sources: http://raspbian.raspberrypi.org/raspbian bullseye/main armhf Packages

Linux rpisrv1 5.10.92-v7+ #1514 armv7l GNU/Linux

PaxJaromeMalues commented 2 years ago

Okay... ld IS installed tho. via binutils

Version: 2.35.2-2+rpi1
Priority: optional
Section: devel
Maintainer: Matthias Klose <doko@debian.org>
Installed-Size: 100 kB
Provides: binutils-gold, elf-binutils
Depends: binutils-common (= 2.35.2-2+rpi1), libbinutils (= 2.35.2-2+rpi1), binutils-arm-linux-gnueabihf:any (= 2.35.2-2+rpi1)
Suggests: binutils-doc (>= 2.35.2-2+rpi1)
Conflicts: binutils-multiarch (<< 2.27-8), modutils (<< 2.4.19-1)
Homepage: https://www.gnu.org/software/binutils/
Download-Size: 61.7 kB
APT-Manual-Installed: no
APT-Sources: http://raspbian.raspberrypi.org/raspbian bullseye/main armhf Packages
Description: GNU assembler, linker and binary utilities
 The programs in this package are used to assemble, link and manipulate
 binary and object files.  They may be used in conjunction with a compiler
 and various libraries to build programs.

lrwxrwxrwx 1 root root 22 Mar 6 2021 /bin/ld -> arm-linux-gnueabihf-ld
lrwxrwxrwx 1 root root 22 Mar 6 2021 /usr/bin/ld -> arm-linux-gnueabihf-ld
-rwxr-xr-x 1 root root 703680 Mar 6 2021 /usr/bin/arm-linux-gnueabihf-ld.bfd
-rwxr-xr-x 1 root root 4505052 Mar 6 2021 /usr/bin/arm-linux-gnueabihf-ld.gold

Just to make sure we are on the same page: I installed the OS from the following image: https://downloads.raspberrypi.org/raspios_lite_armhf/images/raspios_lite_armhf-2022-01-28/2022-01-28-raspios-bullseye-armhf-lite.zip. I used the Raspberry Pi Imager with a manually selected image file, as the default selection kept failing (which is no issue, as the default installs the same image type, downloading and caching the latest stable).

doe300 commented 2 years ago

Yeah that seems okay, I also tested it with Raspberry Pi OS Bullseye Lite.

Why clang can't find ld I have no idea, can you maybe check and try the instructions from here? Other than that, do you know why clang is your default compiler? Did you set this up? Did you maybe install clang before GCC/build-essential?

Can you build other CMake/C++ projects (e.g. clpeak)?
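
To see which compiler a symlink such as /usr/bin/cc really ends up at, something like the sketch below could help. The function name is made up, and `readlink -f` is assumed to be the GNU coreutils variant shipped with Raspberry Pi OS:

```shell
#!/bin/sh
# Sketch only: follow a compiler symlink chain and classify the final binary.
# Hypothetical helper; prints "clang", "gcc" or "unknown".
compiler_family() {
    target=$(readlink -f "$1") || return 1
    case "$(basename "$target")" in
        clang*)      echo "clang" ;;
        *gcc*|*g++*) echo "gcc" ;;
        *)           echo "unknown" ;;
    esac
}
```

If `compiler_family /usr/bin/cc` reports clang, one thing to try is selecting GCC explicitly in a fresh build directory with `CC=gcc CXX=g++ cmake ..`, since CMake reads these environment variables on the first configure run. Whether that avoids the missing ld error here is untested.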

PaxJaromeMalues commented 2 years ago

Yeah that seems okay, I also tested it with Raspberry Pi OS Bullseye Lite.

Why clang can't find ld I have no idea, can you maybe check and try the instructions from here? Other than that, do you know why clang is your default compiler? Did you set this up? Did you maybe install clang before GCC/build-essential?

Can you build other CMake/C++ projects (e.g. clpeak)?

I will hit the bed right now. But I will try all this out by tomorrow and properly document the results for you.

PaxJaromeMalues commented 2 years ago

Other than that, do you know why clang is your default compiler? Did you set this up? Did you maybe install clang before GCC/build-essential?

No, I have no idea why. I guess it's the OS setup default, maybe? I did all apt installs in the order written in your wiki.