pocl - Portable Computing Language
https://portablecl.org
MIT License

make check fails when compiled with ENABLE_ICD=On #1283

Open 222464 opened 1 year ago

222464 commented 1 year ago

Hello,

I tried building the 4.0 release as well as the latest in the main branch. All tests from make check pass when ENABLE_ICD=Off. With ENABLE_ICD=On, (almost) all of the tests fail.

5% tests passed, 249 tests failed out of 262

Label Time Summary:
EinsteinToolkit    =   1.41 sec*proc (2 tests)
cuda               =   0.54 sec*proc (42 tests)
dlopen             =   0.04 sec*proc (3 tests)
hsa                =   0.05 sec*proc (4 tests)
hsa-native         =   1.82 sec*proc (82 tests)
internal           =   6.76 sec*proc (254 tests)
kernel             =   3.24 sec*proc (76 tests)
level0             =   2.70 sec*proc (125 tests)
matrix             =   0.09 sec*proc (4 tests)
poclbin            =   0.05 sec*proc (4 tests)
proxy              =   1.16 sec*proc (36 tests)
regression         =   1.03 sec*proc (95 tests)
runtime            =   0.29 sec*proc (29 tests)
tce                =   0.10 sec*proc (9 tests)
vulkan             =   0.30 sec*proc (26 tests)
workgroup          =   0.40 sec*proc (31 tests)

Total Test time (real) =   1.72 sec

The following tests did not run:
        262 - EinsteinToolkit_SubDev (Skipped)

The following tests FAILED:
          1 - pocl_version_check (Failed)
          5 - kernel/test_as_type_loopvec (Failed)
          6 - kernel/test_as_type_cbs (Failed)
          7 - kernel/test_convert_type_1_loopvec (Failed)
          8 - kernel/test_convert_type_1_cbs (Failed)
          9 - kernel/test_convert_type_2_loopvec (Failed)
         10 - kernel/test_convert_type_2_cbs (Failed)
         11 - kernel/test_convert_type_4_loopvec (Failed)
         12 - kernel/test_convert_type_4_cbs (Failed)
         13 - kernel/test_convert_type_8_loopvec (Failed)
         14 - kernel/test_convert_type_8_cbs (Failed)
         15 - kernel/test_convert_type_16_loopvec (Failed)
         16 - kernel/test_convert_type_16_cbs (Failed)
         17 - kernel/test_bitselect_loopvec (Failed)
         18 - kernel/test_bitselect_cbs (Failed)
         19 - kernel/test_hadd_loops (Failed)
         20 - kernel/test_hadd_loopvec (Failed)
         21 - kernel/test_hadd_cbs (Failed)
         22 - kernel/test_min_max_loopvec (Failed)
         23 - kernel/test_min_max_cbs (Failed)
         24 - kernel/test_length_distance_loopvec (Failed)
         25 - kernel/test_length_distance_cbs (Failed)
         26 - kernel/test_fmin_fmax_fma_loopvec (Failed)
         27 - kernel/test_fmin_fmax_fma_cbs (Failed)
         28 - kernel/test_local_struct_array_loopvec (Failed)
         29 - kernel/test_local_struct_array_cbs (Failed)
         30 - kernel/test_convert_sat_regression_loopvec (Failed)
         31 - kernel/test_convert_sat_regression_cbs (Failed)
         32 - kernel/test_rotate_loopvec (Failed)
         33 - kernel/test_rotate_cbs (Failed)
         34 - kernel/test_fabs_loopvec (Failed)
         35 - kernel/test_fabs_cbs (Failed)
         36 - kernel/test_copy_signbit_loopvec (Failed)
         37 - kernel/test_copy_signbit_cbs (Failed)
         38 - kernel/test_ilogb_loopvec (Failed)
         39 - kernel/test_ilogb_cbs (Failed)
         40 - kernel/test_ldexp_loopvec (Failed)
         41 - kernel/test_ldexp_cbs (Failed)
         42 - kernel/test_isnan_loopvec (Failed)
         43 - kernel/test_isnan_cbs (Failed)
         44 - kernel/test_short16_loopvec (Failed)
         45 - kernel/test_short16_cbs (Failed)
         46 - kernel/test_frexp_modf_loopvec (Failed)
         47 - kernel/test_frexp_modf_cbs (Failed)
         48 - kernel/test_sampler_address_clamp_loopvec (Failed)
         49 - kernel/test_sampler_address_clamp_cbs (Failed)
         50 - kernel/test_image_query_funcs_loopvec (Failed)
         51 - kernel/test_image_query_funcs_cbs (Failed)
         52 - kernel/test_shuffle_char_loopvec (Failed)
         53 - kernel/test_shuffle_char_cbs (Failed)
         54 - kernel/test_shuffle_short_loopvec (Failed)
         55 - kernel/test_shuffle_short_cbs (Failed)
         56 - kernel/test_shuffle_ushort_loopvec (Failed)
         57 - kernel/test_shuffle_ushort_cbs (Failed)
         58 - kernel/test_shuffle_int_loopvec (Failed)
         59 - kernel/test_shuffle_int_cbs (Failed)
         60 - kernel/test_shuffle_uint_loopvec (Failed)
         61 - kernel/test_shuffle_uint_cbs (Failed)
         62 - kernel/test_shuffle_half_loopvec (Failed)
         63 - kernel/test_shuffle_half_cbs (Failed)
         64 - kernel/test_shuffle_float_loopvec (Failed)
         65 - kernel/test_shuffle_float_cbs (Failed)
         66 - kernel/test_shuffle_double_loopvec (Failed)
         67 - kernel/test_shuffle_double_cbs (Failed)
         68 - kernel/test_shuffle_long_loopvec (Failed)
         69 - kernel/test_shuffle_long_cbs (Failed)
         70 - kernel/test_shuffle_ulong_loopvec (Failed)
         71 - kernel/test_shuffle_ulong_cbs (Failed)
         72 - kernel/test_ucharn_loopvec (Failed)
         73 - kernel/test_ucharn_cbs (Failed)
         74 - kernel/test_printf_loopvec (Failed)
         75 - kernel/test_printf_cbs (Failed)
         80 - kernel/test_sizeof_uint_loopvec (Failed)
         81 - kernel/test_sizeof_uint_cbs (Failed)
         82 - regression/test_issue_231_loopvec (Failed)
         83 - regression/test_issue_231_cbs (Failed)
         84 - regression/test_issue_445_loopvec (Failed)
         85 - regression/test_issue_445_cbs (Failed)
         86 - regression/test_issue_553_loopvec (Failed)
         87 - regression/test_issue_553_cbs (Failed)
         88 - regression/test_issue_577_loopvec (Failed)
         89 - regression/test_issue_577_cbs (Failed)
         92 - regression/test_workitem_func_outside_kernel_loopvec (Failed)
         93 - regression/test_workitem_func_outside_kernel_cbs (Failed)
         94 - regression/test_program_scope_vars (Failed)
         95 - regression/test_llvm_segfault_issue_889_loopvec (Failed)
         96 - regression/test_llvm_segfault_issue_889_cbs (Failed)
         99 - regression/test_flatten_barrier_subs_loopvec (Failed)
        100 - regression/test_flatten_barrier_subs_cbs (Failed)
        101 - regression/phi_nodes_not_replicated_loopvec (Failed)
        102 - regression/phi_nodes_not_replicated_cbs (Failed)
        103 - regression/phi_nodes_not_replicated_repl (Failed)
        104 - regression/issues_with_local_pointers_loopvec (Failed)
        105 - regression/issues_with_local_pointers_cbs (Failed)
        106 - regression/issues_with_local_pointers_repl (Failed)
        107 - regression/barrier_between_two_for_loops_loopvec (Failed)
        108 - regression/barrier_between_two_for_loops_cbs (Failed)
        109 - regression/barrier_between_two_for_loops_repl (Failed)
        110 - regression/simple_for-loop_with_a_barrier_inside_loopvec (Failed)
        111 - regression/simple_for-loop_with_a_barrier_inside_cbs (Failed)
        112 - regression/simple_for-loop_with_a_barrier_inside_repl (Failed)
        113 - regression/for-loop_with_computation_after_the_brexit_loopvec (Failed)
        114 - regression/for-loop_with_computation_after_the_brexit_cbs (Failed)
        115 - regression/for-loop_with_computation_after_the_brexit_repl (Failed)
        116 - regression/for-loop_with_a_variable_iteration_count_loopvec (Failed)
        117 - regression/for-loop_with_a_variable_iteration_count_cbs (Failed)
        118 - regression/for-loop_with_a_variable_iteration_count_repl (Failed)
        119 - regression/early_return_before_a_barrier_region_loopvec (Failed)
        120 - regression/early_return_before_a_barrier_region_cbs (Failed)
        121 - regression/early_return_before_a_barrier_region_repl (Failed)
        122 - regression/id-dependent_computation_before_kernel_exit_loopvec (Failed)
        123 - regression/id-dependent_computation_before_kernel_exit_cbs (Failed)
        124 - regression/id-dependent_computation_before_kernel_exit_repl (Failed)
        125 - regression/barrier_just_before_return_loopvec (Failed)
        126 - regression/barrier_just_before_return_cbs (Failed)
        127 - regression/barrier_just_before_return_repl (Failed)
        128 - regression/infinite_loop_loopvec (Failed)
        129 - regression/infinite_loop_cbs (Failed)
        130 - regression/infinite_loop_repl (Failed)
        131 - regression/undominated_variable_from_conditional_barrier_handling_loopvec (Failed)
        132 - regression/undominated_variable_from_conditional_barrier_handling_cbs (Failed)
        133 - regression/undominated_variable_from_conditional_barrier_handling_repl (Failed)
        134 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_loopvec (Failed)
        135 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_cbs (Failed)
        136 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_repl (Failed)
        137 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_2_loopvec (Failed)
        138 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_2_cbs (Failed)
        139 - regression/assigning_a_loop_iterator_variable_to_a_private_makes_it_local_2_repl (Failed)
        140 - regression/test_program_from_binary_with_local_1_1_1_loopvec (Failed)
        141 - regression/test_program_from_binary_with_local_1_1_1_cbs (Failed)
        142 - regression/test_program_from_binary_with_local_1_1_1_repl (Failed)
        143 - regression/test_alignment_with_dynamic_wg_114_loopvec (Failed)
        144 - regression/test_alignment_with_dynamic_wg_114_cbs (Failed)
        145 - regression/test_alignment_with_dynamic_wg_117_loopvec (Failed)
        146 - regression/test_alignment_with_dynamic_wg_117_cbs (Failed)
        147 - regression/test_alignment_with_dynamic_wg_225_loopvec (Failed)
        148 - regression/test_alignment_with_dynamic_wg_225_cbs (Failed)
        149 - regression/test_alignment_with_dynamic_wg_173_loopvec (Failed)
        150 - regression/test_alignment_with_dynamic_wg_173_cbs (Failed)
        151 - regression/test_alignment_with_dynamic_wg_183_loopvec (Failed)
        152 - regression/test_alignment_with_dynamic_wg_183_cbs (Failed)
        153 - regression/test_alignment_with_dynamic_wg_283_loopvec (Failed)
        154 - regression/test_alignment_with_dynamic_wg_283_cbs (Failed)
        155 - regression/test_alignment_with_dynamic_wg_332_loopvec (Failed)
        156 - regression/test_alignment_with_dynamic_wg_332_cbs (Failed)
        157 - regression/test_alignment_with_dynamic_wg_323_loopvec (Failed)
        158 - regression/test_alignment_with_dynamic_wg_323_cbs (Failed)
        159 - regression/test_alignment_with_dynamic_wg2_loopvec (Failed)
        160 - regression/test_alignment_with_dynamic_wg2_cbs (Failed)
        161 - regression/test_alignment_with_dynamic_wg3_loopvec (Failed)
        162 - regression/test_alignment_with_dynamic_wg3_cbs (Failed)
        163 - regression/setting_a_buffer_argument_to_NULL_causes_a_segfault_loopvec (Failed)
        164 - regression/setting_a_buffer_argument_to_NULL_causes_a_segfault_cbs (Failed)
        165 - regression/clSetKernelArg_overwriting_the_previous_kernel's_args_loopvec (Failed)
        166 - regression/clSetKernelArg_overwriting_the_previous_kernel's_args_cbs (Failed)
        167 - regression/passing_a_constant_array_as_an_arg_loopvec (Failed)
        168 - regression/passing_a_constant_array_as_an_arg_cbs (Failed)
        169 - regression/case_with_multiple_variable_length_loops_and_a_barrier_in_one_loopvec (Failed)
        170 - regression/case_with_multiple_variable_length_loops_and_a_barrier_in_one_cbs (Failed)
        171 - regression/autolocals_in_constexprs_loopvec (Failed)
        172 - regression/autolocals_in_constexprs_cbs (Failed)
        173 - regression/struct_kernel_arguments_loopvec (Failed)
        174 - regression/struct_kernel_arguments_cbs (Failed)
        175 - regression/vector_kernel_arguments_loopvec (Failed)
        176 - regression/vector_kernel_arguments_cbs (Failed)
        177 - runtime/clGetDeviceInfo (Failed)
        178 - runtime/clEnqueueNativeKernel (Failed)
        179 - runtime/clGetEventInfo (Failed)
        180 - runtime/clCreateProgramWithBinary (Failed)
        181 - runtime/clBuildProgram (Failed)
        182 - runtime/test_kernel_cache_includes (Failed)
        183 - runtime/clFinish (Failed)
        184 - runtime/test_event_cycle (Failed)
        185 - runtime/test_link_error (Failed)
        186 - runtime/test_read-copy-write-buffer (Failed)
        187 - runtime/test_fill-buffer (Failed)
        188 - runtime/test_buffer-image-copy (Failed)
        189 - runtime/clCreateKernel (Failed)
        190 - runtime/clGetKernelArgInfo (Failed)
        191 - runtime/clSetEventCallback (Failed)
        192 - runtime/clGetSupportedImageFormats (Failed)
        193 - runtime/clCreateKernelsInProgram (Failed)
        194 - runtime/clCreateSubDevices (Failed)
        195 - runtime/test_event_free (Failed)
        196 - runtime/test_event_double_wait (Failed)
        197 - runtime/test_enqueue_kernel_from_binary (Failed)
        198 - runtime/test_user_event (Failed)
        199 - runtime/test_buffer_migration (Failed)
        200 - runtime/test_buffer_ping_pong (Failed)
        201 - runtime/clSetMemObjectDestructorCallback (Failed)
        202 - runtime/test_cl_pocl_content_size (Failed)
        203 - runtime/test_deviceside_enqueue (Failed)
        204 - runtime/test_command_buffer (Failed)
        205 - runtime/test_command_buffer_images (Failed)
        206 - workgroup/different_implicit_barrier_injection_scenarios (Failed)
        207 - workgroup/unbarriered_for_loops_loopvec (Failed)
        208 - workgroup/unbarriered_for_loops_cbs (Failed)
        209 - workgroup/barriered_for_loops_loopvec (Failed)
        210 - workgroup/barriered_for_loops_cbs (Failed)
        211 - workgroup/switch_case_loopvec (Failed)
        212 - workgroup/switch_case_cbs (Failed)
        213 - workgroup/b_loop_with_none_of_the_WIs_reaching_the_barrier_loopvec (Failed)
        214 - workgroup/b_loop_with_none_of_the_WIs_reaching_the_barrier_cbs (Failed)
        215 - workgroup/for_with_divergent_return_loopvec (Failed)
        216 - workgroup/for_with_divergent_return_cbs (Failed)
        217 - workgroup/cond_barriers_in_for_loopvec (Failed)
        218 - workgroup/cond_barriers_in_for_cbs (Failed)
        219 - workgroup/cond_barrier_in_var_for (Failed)
        220 - workgroup/unconditional_barriers_loopvec (Failed)
        221 - workgroup/unconditional_barriers_cbs (Failed)
        222 - workgroup/conditional_barrier_loopvec (Failed)
        223 - workgroup/conditional_barrier_cbs (Failed)
        224 - workgroup/forcing_horizontal_parallelization_to_some_outer_loopvec (Failed)
        225 - workgroup/loop_with_two_paths_to_the_latch_loopvec (Failed)
        226 - workgroup/loop_with_two_paths_to_the_latch_cbs (Failed)
        227 - workgroup/b_loop_with_two_latches_loopvec (Failed)
        228 - workgroup/b_loop_with_two_latches_cbs (Failed)
        229 - workgroup/workgroup_sizes_work_items_get_wrong_ids_loopvec (Failed)
        230 - workgroup/workgroup_sizes_work_items_get_wrong_ids_cbs (Failed)
        231 - workgroup/issue_548_convergent_propagation_loopvec (Failed)
        232 - workgroup/issue_548_convergent_propagation_cbs (Failed)
        233 - workgroup/range_md_small_grid_loopvec (Failed)
        234 - workgroup/range_md_small_grid_cbs (Failed)
        235 - workgroup/range_md_large_grid_loopvec (Failed)
        236 - workgroup/range_md_large_grid_cbs (Failed)
        237 - examples/example0 (Failed)
        238 - examples/example0_spir (Failed)
        239 - examples/example1_dot_product (Failed)
        240 - examples/example1_spir (Failed)
        241 - examples/example1_poclbin (Failed)
        242 - examples/example2 (Failed)
        243 - examples/example2_spir (Failed)
        244 - examples/example2_poclbin (Failed)
        245 - examples/example2a (Failed)
        246 - examples/example2a_spir (Failed)
        247 - examples/example2a_poclbin (Failed)
        248 - examples/matrix1 (Failed)
        249 - examples/matrix1_local (Failed)
        250 - examples/matrix1_spir (Failed)
        251 - examples/matrix1_spir_local (Failed)
        252 - examples/matrix1_poclbin (Failed)
        253 - poclcc (Failed)
        254 - examples/scalarwave_loopvec (Failed)
        255 - examples/scalarwave_cbs (Failed)
        256 - examples/trig (Failed)
        257 - examples/vecadd (Failed)
        258 - examples/vecadd_large_grid (Failed)
        259 - examples/matadd (Failed)
        260 - examples/boxadd (Failed)
Errors while running CTest

The tests fail with:

STDERR:

  Assertion:

  err == CL_SUCCESS

  Assertion:

  err == CL_SUCCESS

  CL_INVALID_CONTEXT in run on line <line number from test>

or:

STDERR:

  CL_PLATFORM_NOT_FOUND_KHR in main on line <line number from test>
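
Both errors above come from platform and context setup rather than from kernel execution, which fits the failures being spread over every suite. One quick way to confirm that spread is to group the CTest summary by suite. The following is only a sketch: it assumes the "N - name (Status)" line format shown in the log above, and the helper name failures_by_suite is made up for illustration.

```python
import re
from collections import Counter

# Matches CTest summary lines such as "  5 - kernel/test_as_type_loopvec (Failed)".
LINE_RE = re.compile(r"^\s*(\d+) - (\S+) \((\w+)\)\s*$")


def failures_by_suite(ctest_output: str) -> Counter:
    """Count failed tests per suite (the part of the test name before the first '/')."""
    counts = Counter()
    for line in ctest_output.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3) == "Failed":
            name = match.group(2)
            # Tests without a directory prefix are grouped under "(top level)".
            suite = name.split("/", 1)[0] if "/" in name else "(top level)"
            counts[suite] += 1
    return counts


# A few lines copied from the log above; the skipped test is ignored.
sample = """\
          1 - pocl_version_check (Failed)
          5 - kernel/test_as_type_loopvec (Failed)
          6 - kernel/test_as_type_cbs (Failed)
        177 - runtime/clGetDeviceInfo (Failed)
        262 - EinsteinToolkit_SubDev (Skipped)
"""
for suite, count in sorted(failures_by_suite(sample).items()):
    print(suite, count)
```

If every suite shows up with failures, the problem is more likely in how the tests find the platform than in any individual test.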

OS: Manjaro x86_64

I also tried the Khronos ICD loader and reinstalling ocl-icd, to no avail. clinfo does detect pocl properly, though.

Any ideas?

pjaaskel commented 1 year ago

I always use ICD and my tests pass, so it shouldn't be completely busted. Maybe your clinfo finds your system (package-manager-installed) PoCL? Do you force ENABLE_ICD=ON, or does it autodetect it successfully (do you have ocl-icd-dev installed)? Perhaps some symbol is missing in libpocl.so (use nm to check), or you have an unclean build. Shooting in the dark here.
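
For the nm check, a rough sketch of how the output could be scanned for missing symbols. Which symbols matter is an assumption here: clGetExtensionFunctionAddress and clIcdGetPlatformIDsKHR are the entry points named by the cl_khr_icd extension, but whether a given PoCL build exports both directly is not something this thread confirms. The helper names are made up for illustration.

```python
import subprocess


def exported_symbols(nm_output: str) -> set:
    """Return the names of defined symbols from `nm -D --defined-only` output."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols are printed as "<address> <type> <name>".
        if len(parts) == 3:
            # Drop a symbol-version suffix such as "@@OPENCL_1.0" if present.
            names.add(parts[2].split("@", 1)[0])
    return names


def missing_symbols(nm_output: str, required) -> list:
    """Return the required symbol names that do not appear in the nm output."""
    have = exported_symbols(nm_output)
    return sorted(name for name in required if name not in have)


def nm_dynamic(lib_path: str) -> str:
    """Run GNU nm on a shared library and return its dynamic symbol listing."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", lib_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Illustrative nm output, not taken from a real libpocl.so.
sample_nm = """\
0000000000012340 T clGetExtensionFunctionAddress
0000000000012400 T clCreateContext@@OPENCL_1.0
"""
# Assumed list of ICD entry points to look for (see the note above).
required = ["clGetExtensionFunctionAddress", "clIcdGetPlatformIDsKHR"]
print(missing_symbols(sample_nm, required))
```

On a real build, nm_dynamic would be called with the path to the freshly built libpocl.so and its result passed to missing_symbols.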

222464 commented 1 year ago

@pjaaskel I had a version of PoCL installed via the package manager before, but I uninstalled it prior to installing from the repository, since it was running very slowly and I wanted to see if I could tweak some options (didn't get to that yet though).

ENABLE_ICD=On is auto-detected (I do have ocl-icd installed).

I tried building ocl-icd myself as well, no change though.

I tried make clean and deleting the build folder and rebuilding as well.

Keep in mind that it works fine without the ICD, so core PoCL works. It must have something to do with the ICD. Another thing to note is that my AMD ROCm platform is detected and runs fine through the ICD, so the problem may be on PoCL's end somehow.

I checked the symbols. I wouldn't know which ones to look for, but they seem to be correct (and the clCreateContext functions are there).

I have used PoCL on different machines several times in the past, but never had this issue, pretty weird.

pjaaskel commented 1 year ago

Which version of OCL ICD? Is it built against the OpenCL 3.0 headers?

Sometimes proprietary OpenCL drivers overwrite the ICD loader and install an unknown one, and then strangeness like this can happen. Does it work better if you copy the icd file to the default search directory?
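
To see which .icd files a loader would pick up and which library each one points to, something like the sketch below may help. It assumes the common Linux default of /etc/OpenCL/vendors and the OCL_ICD_VENDORS override used by ocl-icd; a loader built with a different default, or a different loader, may behave differently. The helper names are made up for illustration.

```python
import os
from pathlib import Path

# Assumed default search directory; loaders can be built with a different one.
DEFAULT_VENDOR_DIR = "/etc/OpenCL/vendors"


def effective_vendor_dir(environ=os.environ, default=DEFAULT_VENDOR_DIR) -> str:
    """Return the location the loader is assumed to search for .icd files.

    ocl-icd also accepts a single .icd file or a library path in
    OCL_ICD_VENDORS; this sketch only handles the directory case.
    """
    return environ.get("OCL_ICD_VENDORS", default)


def list_icd_entries(vendor_dir) -> dict:
    """Map each *.icd file in vendor_dir to the library name or path it contains."""
    directory = Path(vendor_dir)
    if not directory.is_dir():
        return {}
    entries = {}
    for icd_file in sorted(directory.glob("*.icd")):
        lines = icd_file.read_text().splitlines()
        # An .icd file normally holds one line: a library name or absolute path.
        entries[icd_file.name] = lines[0].strip() if lines else ""
    return entries


if __name__ == "__main__":
    location = effective_vendor_dir()
    print("searching:", location)
    for name, library in list_icd_entries(location).items():
        print(f"  {name} -> {library}")
```

A pocl entry that is missing, or that points at a library other than the one just built, would be one possible explanation for CL_PLATFORM_NOT_FOUND_KHR.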

222464 commented 1 year ago

@pjaaskel It is being built against the OpenCL 3.0 headers, at least when I grep the cache it says the HAVE_OCL_ICD_30_COMPATIBLE variable is set to 1.

Here's something interesting I found in the CMake cache though, when searching for "OPENCL":

ENABLE_LIBLLVMOPENCL:BOOL=OFF
INSTALL_OPENCL_HEADERS:BOOL=OFF
OPENCL_H:FILEPATH=/usr/local/include/CL/opencl.h
OPENCL_HPP:FILEPATH=/usr/include/CL/opencl.hpp
POCL_INSTALL_OPENCL_HEADER_DIR:PATH=/usr/local/include/CL
OPENCL_FOUND:INTERNAL=1
OPENCL_LIBDIR:INTERNAL=/usr/lib
OPENCL_LIBRARIES:INTERNAL=/usr/lib/libOpenCL.so

It seems to be using different headers and libs (one under /usr and one under /usr/local)?
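
The mix can be made explicit by grouping the path-valued cache entries by install prefix. This is a minimal sketch that assumes the NAME:TYPE=VALUE line format shown above; the helper names and the choice of prefixes are made up for illustration.

```python
def parse_cache(text: str) -> dict:
    """Parse CMakeCache.txt lines of the form NAME:TYPE=VALUE into {NAME: VALUE}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith(("#", "//")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.split(":", 1)[0]] = value
    return entries


def group_by_prefix(entries: dict, prefixes=("/usr/local", "/usr")) -> dict:
    """Group path-valued entries by the first matching prefix (list longer ones first)."""
    groups = {}
    for name, value in entries.items():
        for prefix in prefixes:
            if value == prefix or value.startswith(prefix + "/"):
                groups.setdefault(prefix, []).append(name)
                break
    return groups


# The cache lines quoted above.
cache_text = """\
ENABLE_LIBLLVMOPENCL:BOOL=OFF
INSTALL_OPENCL_HEADERS:BOOL=OFF
OPENCL_H:FILEPATH=/usr/local/include/CL/opencl.h
OPENCL_HPP:FILEPATH=/usr/include/CL/opencl.hpp
POCL_INSTALL_OPENCL_HEADER_DIR:PATH=/usr/local/include/CL
OPENCL_FOUND:INTERNAL=1
OPENCL_LIBDIR:INTERNAL=/usr/lib
OPENCL_LIBRARIES:INTERNAL=/usr/lib/libOpenCL.so
"""
for prefix, names in group_by_prefix(parse_cache(cache_text)).items():
    print(prefix, names)
```

For the lines above, OPENCL_H and POCL_INSTALL_OPENCL_HEADER_DIR land under /usr/local, while OPENCL_HPP, OPENCL_LIBDIR and OPENCL_LIBRARIES land under /usr, so the C header and the loader library do come from different prefixes.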