Use opt for auto-vectorization

doe300 / VC4C

Compiler for the VC4CL OpenCL implementation

MIT License

Use opt for auto-vectorization #116

Closed: nomaddo closed this 6 years ago

nomaddo commented 6 years ago

This pull request adds auto-vectorization via opt, using the option -force-vector-width=16.

After applying this patch, vectorization works.

[ ] check that the compilation results are correct on a Raspberry Pi
[x] check that VC4C itself compiles successfully everywhere
[ ] check the performance

nomaddo commented 6 years ago

For the kernel below with n = 12000, there is no performance difference between --use-opt and --no-opt:

kernel void hello(global float * x, global float * y, int n){
  int id = get_global_id(0);
  int all = get_global_size(0);
  int offset = id * n / all;

  for (int i = 0; i < n / all; i++)
    x[i + offset] += y[i + offset] * 2;
}

I looked at the intermediate code, and auto-vectorization does take place. Maybe the optimizations done by opt, such as aggressive loop unrolling, are not effective because the instruction cache is small.
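
For illustration only (this is not taken from the patch), here is a minimal plain-C sketch of what a vector width of 16 means for the loop in the kernel above. The names hello_scalar, hello_chunked and VECTOR_WIDTH are made up for this example. The inner lane loop stands in for a single 16-wide vector operation, and the code that opt actually emits for VC4C may look quite different.

```c
#define VECTOR_WIDTH 16 /* assumed to mirror -force-vector-width=16 */

/* Scalar reference: the work done by one work-item of the kernel,
 * with id and all passed in instead of get_global_id/get_global_size. */
void hello_scalar(float *x, const float *y, int n, int id, int all)
{
    int offset = id * n / all;

    for (int i = 0; i < n / all; i++)
        x[i + offset] += y[i + offset] * 2;
}

/* Hand-written stand-in for the vectorized loop (hypothetical helper):
 * the main loop handles VECTOR_WIDTH elements per iteration, and a
 * scalar remainder loop handles whatever is left over. */
void hello_chunked(float *x, const float *y, int n, int id, int all)
{
    int offset = id * n / all;
    int count = n / all;
    int i = 0;

    /* Main loop: each outer iteration models one 16-wide vector operation. */
    for (; i + VECTOR_WIDTH <= count; i += VECTOR_WIDTH)
        for (int lane = 0; lane < VECTOR_WIDTH; lane++)
            x[i + lane + offset] += y[i + lane + offset] * 2;

    /* Remainder loop: elements that do not fill a whole vector. */
    for (; i < count; i++)
        x[i + offset] += y[i + offset] * 2;
}
```

Both functions write the same values, so any speed-up from vectorization would have to come from how the wide operations are executed, not from a change in the result.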

nomaddo commented 6 years ago

@doe300 Does VC4CL have a way to pass compilation options to VC4C? I would like to use it to test this patch.

doe300 commented 6 years ago

In Program.cpp, VC4CL passes the options from the OpenCL calls on to the VC4C compilation steps in precompile_program and link_programs.

doe300 commented 6 years ago

Can you fix the code style issues? Otherwise it looks good to merge.

nomaddo commented 6 years ago

Before committing, I ran make clang-format, but diffs still appeared...

I think the clang-format configuration is incomplete: it is defined in src/CMakeLists.txt, and its targets are only ${VC4CC_SRCS} and ${VC4C_SRCS}. It does not cover the files in include.