lion03 / thrust

Automatically exported from code.google.com/p/thrust
Apache License 2.0

consider using ICC pragmas to vectorize loops in the CPU backends #262

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Is it possible to add "#pragma ivdep" [1] on the line following each "#pragma 
omp parallel for" directive?  This can facilitate vectorization with the Intel 
compiler, and my understanding is that allowing the compiler to ignore vector 
dependencies should always be valid there: if there are vector dependencies 
present which cannot be safely ignored, then the OpenMP parallelization will 
already have introduced race conditions anyway.

[1] 
http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm
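
For illustration, here is a minimal sketch of the placement being requested. 
The function `saxpy` is a hypothetical stand-in, not Thrust code, and the 
preprocessor guards are an assumption added so that the snippet also compiles 
cleanly without OpenMP or with a compiler other than ICC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Thrust function): y[i] = a * x[i] + y[i].
// Every iteration touches only element i, so the loop has no loop-carried
// dependency and is safe both to parallelize and to vectorize.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
  assert(x.size() == y.size());
  const float* xp = x.data();
  float* yp = y.data();
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(y.size());

  // Assumption: guard each pragma so that compilers which do not understand
  // it never see it.
#if defined(_OPENMP)
#pragma omp parallel for
#endif
#if defined(__INTEL_COMPILER)
#pragma ivdep
#endif
  for (std::ptrdiff_t i = 0; i < n; ++i)
  {
    yp[i] = a * xp[i] + yp[i];
  }
}
```

A loop such as a running sum, where iteration i reads the result of 
iteration i - 1, would be unsafe under either pragma, which is the reasoning 
behind the claim above.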

Original issue reported on code.google.com by andrew.c...@gmail.com on 2 Nov 2010 at 7:07

GoogleCodeExporter commented 8 years ago
Yes, this should be possible.  Do you happen to know whether GCC or MSVC have 
similar mechanisms?  Also, do you happen to have a test case on hand that 
substantially benefits from the optimization?

Original comment by wnbell on 2 Nov 2010 at 7:35

GoogleCodeExporter commented 8 years ago
I think the answer is no for both GCC [1] and MSVC [2], unfortunately.  I will 
get back to you with an example.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33426
[2] http://msdn.microsoft.com/en-us/library/d9x1s805.aspx

Original comment by andrew.c...@gmail.com on 2 Nov 2010 at 7:53

GoogleCodeExporter commented 8 years ago

Original comment by wnbell on 6 Feb 2011 at 6:28

GoogleCodeExporter commented 8 years ago
Punt until we can test that these pragmas are effective.

Original comment by jaredhoberock on 8 Sep 2011 at 10:17

GoogleCodeExporter commented 8 years ago
With the recent code reorganization I'm not sure where these additions would 
go.  Arguably they could be applied to the scalar/ implementations with the 
appropriate #ifdef guards.  Alternatively, we could supply an ICC backend and 
make that composable with backend::omp and backend::tbb.

Of course we'd first want to make sure that #pragma ivdep was worth the bother 
at all.
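
As a rough sketch of the "#ifdef guards" option only: the macro name 
`EXAMPLE_IVDEP` and the function `scalar_transform` below are hypothetical 
and are not taken from Thrust's scalar/ directory. The sketch assumes the 
compiler supports the `_Pragma` operator (C99 / C++11).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: expands to the ICC pragma when compiling with the Intel
// compiler and to nothing otherwise.
#if defined(__INTEL_COMPILER)
#  define EXAMPLE_IVDEP _Pragma("ivdep")
#else
#  define EXAMPLE_IVDEP
#endif

// Hypothetical stand-in for a sequential ("scalar") loop implementation.
// Assumption: the input and output ranges do not partially overlap; without
// OpenMP in the picture, that guarantee has to come from the caller.
template <typename T, typename UnaryFunction>
void scalar_transform(const T* first, std::ptrdiff_t n, T* result, UnaryFunction f)
{
  assert(n >= 0);
  EXAMPLE_IVDEP
  for (std::ptrdiff_t i = 0; i < n; ++i)
  {
    result[i] = f(first[i]);
  }
}
```

Keeping the pragma behind a single macro would confine the compiler check to 
one place, but whether it yields a measurable speedup still needs testing.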

Original comment by wnbell on 24 Jan 2012 at 2:01

GoogleCodeExporter commented 8 years ago
Forwarded to https://github.com/thrust/thrust/issues/76

Original comment by jaredhoberock on 7 May 2012 at 9:30