doe300 / VC4C

Compiler for the VC4CL OpenCL implementation
MIT License

Compilation failed in simple loop #97

Closed long-long-float closed 6 years ago

long-long-float commented 6 years ago

This program fails to compile with VC4C --asm -O3 --fno-extract-loads-from-loops -o loop loop.cl:

__kernel void loop (__global float * a){
    for (int i = 0; i < 100; i++) {
        a[i] = i * 2;
    }
}

Compilation succeeds with VC4C --asm -O2 -o loop loop.cl, so I think the problem is in vectorize-loops.

doe300 commented 6 years ago

The compilation works for me with the latest commit (8dffec4), though the vectorization is not applied. Can you re-check and, if it still fails, give some more details about the error?

long-long-float commented 6 years ago

I checked with the latest commit, but the same error still occurs. Here is the error log:

[D] Thu May 10 23:56:38 2018: Disabling optimization: extract-loads-from-loops
[D] Thu May 10 23:56:38 2018: Compiling 'loop.cl' into 'loop' with optimization level 3 and options '' ...
[D] Thu May 10 23:56:38 2018: Temporary file '/tmp/vc4c-1hKH9Y' created
[I] Thu May 10 23:56:38 2018: Compiling OpenCL to LLVM-IR with: /usr/bin/clang-4.0 -cc1 -triple spir-unknown-unknown  -I . -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant  -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -include-pch /path/to/VC4C/../VC4CLStdLib/include/VC4CLStdLib.h.pch -x cl -S -emit-llvm-bc -o /tmp/vc4c-1hKH9Y loop.cl
[I] Thu May 10 23:56:41 2018: Compilation complete!
[D] Thu May 10 23:56:41 2018: Reading LLVM module from bit-code...
[D] Thu May 10 23:56:41 2018: Found SPIR kernel-function: loop
[D] Thu May 10 23:56:41 2018: Reading function void loop(...)
[D] Thu May 10 23:56:41 2018: Reading parameter f32* %a
[D] Thu May 10 23:56:41 2018: Mapping function 'loop'...
[D] Thu May 10 23:56:41 2018: Generating label label %tmp.0
[D] Thu May 10 23:56:41 2018: Generating unconditional branch to 0x11fa0d8
[D] Thu May 10 23:56:41 2018: Generating label label %tmp.2
[D] Thu May 10 23:56:41 2018: Generating return nothing
[D] Thu May 10 23:56:41 2018: Generating label label %tmp.1
[D] Thu May 10 23:56:41 2018: Generating Phi-Node with 2 options into i32 %i.05
[D] Thu May 10 23:56:41 2018: Generating binary operation shl with i32 %i.05 and i32 1 into i32 %mul
[D] Thu May 10 23:56:41 2018: Generating unary operation sitofp with i32 %mul into f32 %conv
[D] Thu May 10 23:56:41 2018: Generating calculating index i32 %i.05 of f32* %a into f32* %arrayidx
[D] Thu May 10 23:56:41 2018: Generating writing of f32 %conv into f32* %arrayidx
[D] Thu May 10 23:56:41 2018: Generating binary operation add with i32 %i.05 and i32 1 into i32 %inc
[D] Thu May 10 23:56:41 2018: Generating comparison slt with i32 %inc and i32 100 into %cmp
[D] Thu May 10 23:56:41 2018: Generating branch on condition bool %cmp to either label %tmp.1 or label %tmp.2
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: EliminatePhiNodes
[D] Thu May 10 23:56:41 2018: Eliminating phi-node by inserting moves: i32 %i.05 = phi %tmp.1 -> i32 %inc, %tmp.0 -> i32 0
[D] Thu May 10 23:56:41 2018: Inserting 'move' into end of basic-block: %tmp.1
[D] Thu May 10 23:56:41 2018: Inserting 'move' into end of basic-block: %tmp.0
[I] Thu May 10 23:56:41 2018: -----
[I] Thu May 10 23:56:41 2018: Inlining functions for kernel: loop
[I] Thu May 10 23:56:41 2018: -----
[D] Thu May 10 23:56:41 2018: -----
[I] Thu May 10 23:56:41 2018: Running normalization passes for: loop
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: MapMemoryAccess
[D] Thu May 10 23:56:41 2018: Generating memory access which cannot be lowered into VPM: store f32 %conv into f32* %arrayidx
[D] Thu May 10 23:56:41 2018: 0 16 0 1
[D] Thu May 10 23:56:41 2018: Found base address - with offset -1 for writing into memory
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: ResolveStackAllocations
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: Intrinsics
[D] Thu May 10 23:56:41 2018: Intrinsifying multiplication with left-shift
[D] Thu May 10 23:56:41 2018: Intrinsifying comparison 'slt' to arithmetic operations
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: EliminateReturns
[D] Thu May 10 23:56:41 2018: Replacing return in kernel-function with branch to end-label
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: HandleLiteralVector
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: MapGlobalDataToAddress
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: HandleImmediates
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value 0 to: 0 (0)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value 1 to: 1 (1)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value 2 to: 2 (2)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value 1 to: 1 (1)
[D] Thu May 10 23:56:41 2018: Calculating immediate value 100 with operation 'mul24' and immediate value 10 (10)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value true to: 1 (1)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value true to: 1 (1)
[D] Thu May 10 23:56:41 2018: Mapping constant for immediate value true to: 1 (1)
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: CheckNormalized
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: AddStartStopSegment
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: UnrollWorkGroups
[I] Thu May 10 23:56:41 2018: 
[I] Thu May 10 23:56:41 2018: Normalization done, changed number of instructions from 17 to 37
[D] Thu May 10 23:56:41 2018: -----
[D] Thu May 10 23:56:41 2018: -----
[I] Thu May 10 23:56:41 2018: Running optimization passes for: loop
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: CombineDuplicateBranches
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: MergeBasicBlocks
[D] Thu May 10 23:56:41 2018: CFG created/updated for function: loop
[D] Thu May 10 23:56:41 2018: Found basic block with single direct successor: label: %start_of_function and label: %tmp.0
[D] Thu May 10 23:56:41 2018: Removing basic block 'label: %tmp.0' from function loop
[D] Thu May 10 23:56:41 2018: Merged block label %tmp.0 into label %start_of_function
[D] Thu May 10 23:56:41 2018: Merged 1 pair of blocks!
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: SingleSteps
[D] Thu May 10 23:56:41 2018: Running steps: EliminateUselessBranch, CombineSelectionWithZero, CombineSettingSameFlags, 
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: CombineRotations
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: GeneralOptimizations
[D] Thu May 10 23:56:41 2018: Replacing 'i32 %immediate.8 = mul24 i32 10 (10), i32 10 (10)' with constant value: i32 100
[D] Thu May 10 23:56:41 2018: Replacing obsolete move with instruction calculating its source: register vpm = f32 %conv
[D] Thu May 10 23:56:41 2018: Replacing obsolete move with instruction calculating its source: register vpw_addr = f32* %arrayidx
[D] Thu May 10 23:56:41 2018: replaceValue: replace i32 %immediate.8 to i32 100 in i32 %icomp.4 = max i32 %inc, i32 %immediate.8
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: EliminateDeadStores
[D] Thu May 10 23:56:41 2018: Removing instruction i32 %immediate.8 = i32 100, since its output is never read
[D] Thu May 10 23:56:41 2018: Cleaned 9 unused locals from method loop
[D] Thu May 10 23:56:41 2018: 
[D] Thu May 10 23:56:41 2018: Running pass: VectorizeLoops
[D] Thu May 10 23:56:41 2018: CFG created/updated for function: loop
[D] Thu May 10 23:56:41 2018: Found a control-flow loop: label: %end_of_function -> label: %start_of_function -> label: %tmp.1 -> label: %tmp.2 -> 
[D] Thu May 10 23:56:41 2018: Found a control-flow loop: label: %tmp.1 -> 
[D] Thu May 10 23:56:41 2018: Failed to find loop iteration variable for loop
[D] Thu May 10 23:56:41 2018: Loop iteration variable candidate: i32 %i.05
[D] Thu May 10 23:56:41 2018: Found lower bound: i32 0
[D] Thu May 10 23:56:41 2018: Found iteration instruction: i32 %inc = add i32 %i.05, i32 1 (1)
[D] Thu May 10 23:56:41 2018: Found loop repetition branch: br.ifzc %tmp.1 (on bool %cmp) (ifzc )
[D] Thu May 10 23:56:41 2018: Found loop continue condition: register - = xor i32 %icomp.4, i32 %inc (setf )
[D] Thu May 10 23:56:41 2018: Found upper bound: i32 %icomp.4
[D] Thu May 10 23:56:41 2018: Found comparison type: eq
[D] Thu May 10 23:56:41 2018: Found maximum used vector-width of 1 elements
[E] Thu May 10 23:56:41 2018: Background worker threw error: Attempt to access value of a disengaged optional object
terminate called after throwing an instance of 'std::experimental::fundamentals_v1::bad_optional_access'
  what():  Attempt to access value of a disengaged optional object
fish: './build/VC4C --asm -O3 --fno-ex…' terminated by signal SIGABRT (Abort)
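The abort at the end of the log is the generic failure mode of calling value() on an empty optional. The log does not show which optional was empty, so the following is only a minimal sketch of that failure mode and of a guard that skips the optimization instead of terminating. It assumes C++17 std::optional (the log shows std::experimental::optional), and LoopBounds, analyzeLoop and iterationCount are hypothetical names for illustration, not VC4C's actual types or functions.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the result of a loop analysis; not VC4C's actual type.
struct LoopBounds {
    int lower;
    int upper;
    int step;
};

// Hypothetical analysis step: yields no value when the upper bound is not a
// known constant or the step is not positive.
std::optional<LoopBounds> analyzeLoop(int lower, std::optional<int> constantUpper, int step) {
    if (!constantUpper || step <= 0)
        return std::nullopt;
    return LoopBounds{lower, *constantUpper, step};
}

// Returns the number of iterations, or an empty optional if the analysis failed.
// The caller can then skip vectorization instead of aborting the compilation.
std::optional<int> iterationCount(const std::optional<LoopBounds>& bounds) {
    // Guard: calling bounds.value() on an empty optional would throw
    // std::bad_optional_access, as seen at the end of the log above.
    if (!bounds)
        return std::nullopt;
    assert(bounds->step > 0);
    if (bounds->upper <= bounds->lower)
        return 0;
    // Round up so that a partial last step still counts as an iteration.
    return (bounds->upper - bounds->lower + bounds->step - 1) / bounds->step;
}
```

With these definitions, iterationCount(analyzeLoop(0, 100, 1)) holds a value for the loop from the kernel above, while iterationCount(analyzeLoop(0, std::nullopt, 1)) is empty and can be treated as "do not vectorize this loop".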

doe300 commented 6 years ago

@long-long-float can you re-test whether this issue is fixed?

long-long-float commented 6 years ago

@doe300 It works fine! Thank you.