doe300 / VC4CL

OpenCL implementation running on the VideoCore IV GPU of the Raspberry Pi models
MIT License
729 stars 79 forks source link

Validation-error while running OpenCL kernel on Raspberry Pi 3 B+ #83

Closed Jiaqing-Cao closed 4 years ago

Jiaqing-Cao commented 5 years ago

Hi, when I try to use VC4CL to run a OpenCL kernel on Raspberry Pi 3 B+, I met an error :

[W] Fri Aug 23 13:05:30 2019: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail! [W] Fri Aug 23 13:05:30 2019: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail! [W] Fri Aug 23 13:05:30 2019: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail! [W] Fri Aug 23 13:05:30 2019: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail! [E] Fri Aug 23 13:05:36 2019: Validation-error 'Cannot read register ra0 because it just has been written by the previous instruction.' in: or r0, ra0, r3 [E] Fri Aug 23 13:05:36 2019: With reference to instruction: or.ifz ra0, r0, r0 [E] Fri Aug 23 13:05:36 2019: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&)+0xf4 [0x741efd40] [E] Fri Aug 23 13:28:41 2019: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x656358 [0x743c2358] [E] Fri Aug 23 13:28:41 2019: (3) /usr/local/lib/libVC4CC.so.1.2 : +0x65885c [0x743c485c] [E] Fri Aug 23 13:28:41 2019: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (Validator::Message const&)>::operator()(Validator::Message const&) const+0x54 [0x74652360] [E] Fri Aug 23 13:28:41 2019: (5) /usr/local/lib/libVC4CC.so.1.2 : void Validator::Msg(int, msgTemplate, unsigned char) const+0x84 [0x74651c38] [E] Fri Aug 23 13:28:41 2019: (6) /usr/local/lib/libVC4CC.so.1.2 : Validator::ProcessItem(Validator::state&)+0x498 [0x7464f6e0] [E] Fri Aug 23 13:28:41 2019: (7) /usr/local/lib/libVC4CC.so.1.2 : Validator::Validate()+0x220 [0x74650974] [E] Fri Aug 23 13:28:41 2019: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::qpu_asm::CodeGenerator::toMachineCode(vc4c::Method&)+0x164 [0x743c25e8] [E] Fri Aug 23 13:28:41 2019: (9) /usr/local/lib/libVC4CC.so.1.2 : +0x558c0c [0x742c4c0c] [E] Fri Aug 23 13:28:41 2019: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x559ae0 [0x742c5ae0] [E] Fri Aug 23 13:28:41 2019: (11) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method const&)>::operator()(vc4c::Method const&) const+0x54 [0x742ca360] [E] Fri Aug 23 13:28:41 2019: (12) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method, std::vector<vc4c::Method, std::allocator<vc4c::Method> > >(std::vector<vc4c::Method, std::allocator<vc4c::Method> > const&, std::function<void (vc4c::Method const&)> const&, std::cxx11::basic_string<char, std::char_traits, std::allocator >)::{lambda()#1}::operator()() const+0xc8 [0x742c7fbc] [E] Fri Aug 23 13:28:41 2019: (13) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method, std::vector<vc4c::Method, std::allocator<vc4c::Method> > >(std::vector<vc4c::Method, std::allocator<vc4c::Method> > const&, std::function<void (vc4c::Method const&)> const&, std::cxx11::basic_string<char, std::char_traits, std::allocator >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x742cdb18] [E] Fri Aug 23 13:28:41 2019: (14) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x742bb7b4] [E] Fri Aug 23 13:28:41 2019: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x54e940 [0x742ba940] [E] Fri Aug 23 13:28:41 2019: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x54f22c [0x742bb22c] [E] Fri Aug 23 13:28:41 2019: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x54f198 [0x742bb198] [E] Fri Aug 23 13:28:41 2019: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x54f170 [0x742bb170] [E] Fri Aug 23 13:28:41 2019: (19) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x748ae9dc] [E] Fri Aug 23 13:28:41 2019: Background worker threw error: Verifier: vc4asm verification error: Warning V20.1: Cannot read register ra0 because it just has been written by the previous instruction. instruction at 0x41d8 referring to instruction at 0x41d0 [E] Fri Aug 23 13:28:41 2019: While running worker task: CodeGenerator [E] Fri Aug 23 13:28:41 2019: Compiler threw exception: Verifier: vc4asm verification error: Warning V20.1: Cannot read register ra0 because it just has been written by the previous instruction. instruction at 0x41d8 referring to instruction at 0x41d0

The kernel is like : __kernel void test_kernel( IMAGE_DECLARATION(in), IMAGE_DECLARATION(out), __global float matrix[9]) { Image in = CONVERT_TO_IMAGE_STRUCT_NO_STEP(in); Image out = CONVERT_TO_IMAGE_STRUCT(out);

float8 mtx = (float8)(matrix[0], matrix[1], matrix[2], matrix[3], matrix[4], matrix[5], 0.0, 0.0);

float4 in_x_coords = (float4)(0, 1, 2, 3);

float4 new_x = mad(in_x_coords, in_x_coords, in_x_coords);

uchar4 newxcv = convert_uchar4_rtn(new_x);

vstore4(newxcv, 0, out.ptr);

}

Do we support vector operation in VC4CL ? Why is there an error about "read register ra0" ? Look forward to your reply.

doe300 commented 5 years ago

Do we support vector operation in VC4CL ?

Yes, they are supported, since all operations are actually computed for 16-element SIMD anyway.

Why is there an error about "read register ra0" ?

Probably some compiler error. Basically, the verifier checks some hardware constraints, which the generated code violate. I will have a look at this when I have some time.

doe300 commented 5 years ago

How are these macros/types defined?

Jiaqing-Cao commented 5 years ago

Here is the definition of the macros/types:

define IMAGE_DECLARATION(name) \

__global uchar *name##_ptr,      \
uint        name##_stride_x, \
uint        name##_step_x,   \
uint        name##_stride_y, \
uint        name##_step_y,   \
uint        name##_offset_first_element_in_bytes

define CONVERT_TO_IMAGE_STRUCT(name) \

update_image_workitem_ptr(name##_ptr, name##_offset_first_element_in_bytes, name##_stride_x, name##_step_x, name##_stride_y, name##_step_y)

define CONVERT_TO_IMAGE_STRUCT_NO_STEP(name) \

update_image_workitem_ptr(name##_ptr, name##_offset_first_element_in_bytes, name##_stride_x, 0, name##_stride_y, 0)

/ Structure to hold Image information / typedef struct Image { __global uchar ptr; /< Pointer to the starting postion of the buffer */ int offset_first_element_in_bytes; /< The offset of the first element in the source image */ int stride_x; /*< Stride of the image in X dimension (in bytes) / int stride_y; /< Stride of the image in Y dimension (in bytes) */ } Image;

doe300 commented 5 years ago

Now it complains about update_image_workitem_ptr being missing. Can you please send me a fully compilable kernel source?

doe300 commented 4 years ago

Without having the full kernel code, there is not really much I can do here. Feel free to reopen when you send me a fully compilable kernel.