VcDevel / Vc

SIMD Vector Classes for C++
BSD 3-Clause "New" or "Revised" License

Vc offloading on Xeon Phi #76

Open mzyzak opened 8 years ago

mzyzak commented 8 years ago

Dear Matthias,

Could you please add support for offloading Vc code to the Intel Xeon Phi? Here is a code example that I tried to offload:

#include <iostream>
#include <omp.h>
#include "Vc/Vc"
using ::Vc::int_v;

__declspec( target(mic) ) void Increment(int& i)
{
  int_v n(i);
  n++;
  i = n[0];
}

int main()
{
  int nProc;
  int n = 0;
  Increment(n);

  #pragma offload target(mic) inout(n)
  {
    Increment(n);
    nProc = omp_get_num_procs();
  }

  std::cout << "N cores on MIC: " << nProc << std::endl;
  std::cout << "n: " << n << std::endl;

  return 0;
}

With best regards, Maksym

mattkretz commented 8 years ago

Thanks for the test case. I'll try to look into this, but no promises on a timeline yet. But like we discussed, you should try annotating the MIC headers and see whether you can get further. If you do, feel free to contribute your changes!
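For readers following along, "annotating the headers" could look roughly like the sketch below. It assumes the Intel compiler's `#pragma offload_attribute(push, target(mic))` / `#pragma offload_attribute(pop)` pair, which marks every declaration in between as also compiled for the coprocessor. The function `add_one` is a hypothetical stand-in for the contents of a header and is not part of Vc, and the `__INTEL_OFFLOAD` guard is only there so the same file still builds and runs on the host with other compilers.

```cpp
// Sketch only: add_one and run_on_coprocessor_or_host are hypothetical names,
// not Vc code. The pragmas take effect only when the Intel compiler is
// building with offload support (it then defines __INTEL_OFFLOAD).

#if defined(__INTEL_OFFLOAD)
#pragma offload_attribute(push, target(mic))  // declarations below are also built for MIC
#endif

// Stand-in for what would normally be an #include of the header to annotate.
inline int add_one(int x) { return x + 1; }

#if defined(__INTEL_OFFLOAD)
#pragma offload_attribute(pop)                // end of the annotated region
#endif

// Runs add_one inside an offload region when offload is available,
// and as plain host code otherwise.
int run_on_coprocessor_or_host(int n)
{
#if defined(__INTEL_OFFLOAD)
#pragma offload target(mic) inout(n)
#endif
  {
    n = add_one(n);
  }
  return n;
}
```

Wrapping an include in the push/pop pair avoids adding `__declspec( target(mic) )` to each declaration by hand. Whether this is enough for the Vc MIC headers is exactly what remains open in this issue.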

mzyzak commented 8 years ago

Dear Matthias,

It looks like there are problems with inlining in offload compilation. As mentioned here: https://software.intel.com/en-us/articles/effective-use-of-the-intel-compilers-offload-features the Intel offload compiler does not inline the code. Therefore, when Vc/mic/intrinsics.h is included, compilation fails with catastrophic errors like: /home/mzyzak/Vc/Vc/build/include/Vc/mic/intrinsics.h(69): (col. 13) catastrophic error: MIC Intrinsic parameter must be an immediate value. This happens in a large number of the functions declared there, because they pass their own parameters on as arguments to the intrinsics.

Do you know whether we can overcome this problem? Are there ways to force icc to inline in offload mode?

With best regards, Maksym

mattkretz commented 8 years ago

I looked at the Intel article and it says that inlining still works for an indirect function call inside an offload region. Since the intrinsics are all called indirectly through Vc, you shouldn't see this error. Sorry, I have no idea yet; I first need a chance to investigate offloading myself.
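As background on the "must be an immediate value" error: an intrinsic that encodes an operand into the instruction needs a compile-time constant. A wrapper that forwards an ordinary function parameter only satisfies that if the wrapper is inlined and the constant is propagated. A general C++ technique that does not depend on inlining is to pass the value as a template argument. The sketch below illustrates the idea with the SSE2 intrinsic `_mm_shuffle_epi32` as a stand-in for the MIC intrinsics. It is not how Vc's MIC header is written and not a confirmed fix for this issue, and the names `shuffle_lanes` and `reverse_lanes` are hypothetical.

```cpp
#include <array>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define EXAMPLE_HAVE_SSE2 1
#endif

// Imm is a template argument, so it is a compile-time constant inside the
// function body whether or not the compiler inlines this wrapper.
template <int Imm>
std::array<int, 4> shuffle_lanes(const std::array<int, 4>& in)
{
  std::array<int, 4> out;
#ifdef EXAMPLE_HAVE_SSE2
  __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in.data()));
  v = _mm_shuffle_epi32(v, Imm);  // second argument must be an immediate
  _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), v);
#else
  // Portable fallback with the same lane selection: out[i] = in[bits 2i..2i+1 of Imm].
  for (int i = 0; i < 4; ++i) {
    out[i] = in[(Imm >> (2 * i)) & 3];
  }
#endif
  return out;
}

// Hypothetical convenience wrapper: 0x1B selects lanes 3, 2, 1, 0.
inline std::array<int, 4> reverse_lanes(const std::array<int, 4>& in)
{
  return shuffle_lanes<0x1B>(in);
}
```

With this shape, calling `reverse_lanes` on `{1, 2, 3, 4}` yields `{4, 3, 2, 1}`. The cost is that the constant can no longer be chosen at run time, so it only fits wrappers whose callers already know the value at compile time.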