Thrust should be strict about checking whether it is safe to #include device-specific code

GoogleCodeExporter commented 9 years ago

The following code, compiled with gcc, produces undefined references at
link time, rather than an error at compile time:

// bug.cpp:
#include <thrust/device_vector.h>
#include <thrust/sequence.h>

int main(void)
{
  unsigned int N = 10;
  thrust::device_vector<int> v(N);
  thrust::sequence(v.begin(), v.end());
  return 0;
}

Thrust should issue an error if an entry point cannot be compiled with the
given compiler/backend combination, as in this case:  gcc cannot compile
code that launches CUDA kernels, so don't try.

Original issue reported on code.google.com by jaredhoberock on 18 Jun 2009 at 1:47

GoogleCodeExporter commented 9 years ago

This one's on me. I found the macro that nvcc defines when it is the compiler. Just add this to every device .h file:
#ifndef __CUDACC__
#error You are not compiling with nvcc, perhaps the file extension should be .cu
#endif

You won't believe how much trouble gcc caused me with my "main.cpp" and Thrust...

Original comment by XTra.Kra...@gmail.com on 10 Aug 2009 at 9:34

GoogleCodeExporter commented 9 years ago

We should fix this in 1.2.

Original comment by jaredhoberock on 10 Dec 2009 at 6:38

Added labels: Milestone-Release1.2
Removed labels: Milestone-Release1.x

GoogleCodeExporter commented 9 years ago

We also need to check whether it's safe to #include platform-specific stuff like omp.h.

I don't know of a portable way to do this in the preprocessor.

Original comment by jaredhoberock on 11 Jan 2010 at 7:08

GoogleCodeExporter commented 9 years ago

_OPENMP is the portable macro we should use
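
A minimal sketch of how such a guard might look. This is an illustration, not code from Thrust: demo_max_threads and demo_self_check are hypothetical helpers. It relies only on the fact that an OpenMP-enabled compiler defines _OPENMP (for example g++ with -fopenmp), so omp.h is never included when OpenMP is unavailable.

```cpp
#include <cassert>

// _OPENMP is defined by the compiler only when OpenMP support is enabled,
// so omp.h is included only when it is safe to do so.
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not part of Thrust): reports how many threads OpenMP
// could use, falling back to 1 when OpenMP is disabled.
inline int demo_max_threads()
{
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}

// Hypothetical sanity check: at least one thread is always reported.
inline void demo_self_check()
{
  assert(demo_max_threads() >= 1);
}
```

The same translation unit compiles with or without -fopenmp; only the reported thread count changes.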

Original comment by jaredhoberock on 12 Jan 2010 at 7:17

GoogleCodeExporter commented 9 years ago

r769 adds THRUST_STATIC_ASSERT and uses it in cuda::for_each's implementation.  
We should structure device backend implementations like this:

namespace cuda
{

for_each()
{
  THRUST_STATIC_ASSERT(THRUST_DEVICE_COMPILER == THRUST_DEVICE_COMPILER_NVCC);

  // call a function whose implementation contains device language-specific code
  call_code_containing_triple_chevrons();
}

}

The function containing the device language-specific code can be safely elided without confusing users with linker errors.  Instead, this scheme causes a compiler error:

yucky:thrust-static_assert jared$ g++ -x c++ test.cu -I. -I/usr/local/cuda/include
In file included from ./thrust/detail/device/cuda/for_each.h:53,
                 from ./thrust/detail/device/dispatch/for_each.h:20,
                 from ./thrust/detail/device/for_each.h:24,
                 from ./thrust/detail/device/generic/transform.inl:24,
                 from ./thrust/detail/device/generic/transform.h:81,
                 from ./thrust/detail/device/transform.h:24,
                 from ./thrust/detail/dispatch/transform.h:27,
                 from ./thrust/detail/transform.inl:24,
                 from ./thrust/transform.h:277,
                 from ./thrust/detail/device/cuda/copy_device_to_device.h:26,
                 from ./thrust/detail/device/cuda/dispatch/copy.h:24,
                 from ./thrust/detail/device/cuda/copy.h:21,
                 from ./thrust/detail/device/dispatch/copy.h:21,
                 from ./thrust/detail/device/copy.h:21,
                 from ./thrust/detail/dispatch/copy.h:32,
                 from ./thrust/detail/copy.inl:22,
                 from ./thrust/copy.h:285,
                 from ./thrust/detail/device_reference.inl:23,
                 from ./thrust/device_reference.h:962,
                 from ./thrust/detail/device_ptr.inl:23,
                 from ./thrust/device_ptr.h:343,
                 from ./thrust/device_malloc_allocator.h:27,
                 from ./thrust/device_vector.h:26,
                 from test.cu:1:
./thrust/detail/device/cuda/for_each.inl: In function ‘void thrust::detail::device::cuda::for_each(InputIterator, InputIterator, UnaryFunction)’:
./thrust/detail/device/cuda/for_each.inl:90: error: invalid application of ‘sizeof’ to incomplete type ‘thrust::detail::STATIC_ASSERTION_FAILURE<false>’
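
The error message suggests that THRUST_STATIC_ASSERT applies sizeof to a template that is left incomplete for false. The following is a minimal sketch of that Boost-style technique, not Thrust's actual definition: the demo namespace, DEMO_STATIC_ASSERT, and checked_int_width are hypothetical names used only for illustration.

```cpp
#include <cassert>

namespace demo
{

// Declared for every bool, but defined only for true, so
// sizeof(STATIC_ASSERTION_FAILURE<false>) is a compile-time error.
template<bool> struct STATIC_ASSERTION_FAILURE;
template<> struct STATIC_ASSERTION_FAILURE<true> {};

// Gives the sizeof expression somewhere to live inside a typedef.
template<int> struct static_assert_test {};

} // end demo

// Hypothetical macro: expands to a typedef that is well-formed only if B is true.
#define DEMO_STATIC_ASSERT(B) \
  typedef ::demo::static_assert_test< \
    sizeof(::demo::STATIC_ASSERTION_FAILURE<(bool)(B)>)> demo_static_assert_typedef

// Hypothetical usage: the assertion holds on any conforming platform,
// so this function compiles and simply returns sizeof(int).
inline int checked_int_width()
{
  DEMO_STATIC_ASSERT(sizeof(int) >= 2);
  return static_cast<int>(sizeof(int));
}

inline void demo_static_assert_check()
{
  assert(checked_int_width() >= 2);
}
```

Changing the condition to one that is false would produce an "incomplete type" diagnostic similar to the one shown above.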

Original comment by jaredhoberock on 31 Jan 2010 at 2:13

GoogleCodeExporter commented 9 years ago

It turns out the above scheme doesn't quite work, because the argument to
THRUST_STATIC_ASSERT is known prior to template instantiation time (i.e., it is
known at preprocess time).  So the above example fails whenever any compiler
other than nvcc compiles the code, regardless of whether cuda::for_each was called.

Instead, we have to trick the compiler by making the argument to 
THRUST_STATIC_ASSERT depend on information known only at template instantiation 
time:

// the first parameter is ignored
template<typename, bool x>
  struct trick_compiler
{
  static const bool value = x;
};

namespace cuda
{

template<typename Iterator>
for_each()
{
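  // extra parentheses keep the comma in the template argument list from splitting the macro argument
  THRUST_STATIC_ASSERT((trick_compiler<Iterator, THRUST_DEVICE_COMPILER == THRUST_DEVICE_COMPILER_NVCC>::value));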

  // call a function whose implementation contains device language-specific code
  call_code_containing_triple_chevrons();
}

}
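
A self-contained sketch of the deferred check, under stated assumptions: C++11 static_assert stands in for THRUST_STATIC_ASSERT, the DEMO_* macros stand in for Thrust's compiler-detection macros, and device_only_for_each and host_sum are hypothetical. Because the condition depends on the template parameter, the assertion is evaluated only if device_only_for_each is actually instantiated.

```cpp
#include <cassert>

// Hypothetical stand-ins for Thrust's configuration macros; names and values
// are assumptions for illustration only.
#define DEMO_COMPILER_OTHER 0
#define DEMO_COMPILER_NVCC  1

#ifdef __CUDACC__
#define DEMO_DEVICE_COMPILER DEMO_COMPILER_NVCC
#else
#define DEMO_DEVICE_COMPILER DEMO_COMPILER_OTHER
#endif

// The first parameter is ignored; it only makes `value` a dependent expression.
template<typename, bool x>
struct trick_compiler
{
  static const bool value = x;
};

// Device-only entry point: the assertion fires only when this template is
// instantiated by a compiler other than nvcc.
template<typename Iterator>
void device_only_for_each(Iterator, Iterator)
{
  static_assert(trick_compiler<Iterator, DEMO_DEVICE_COMPILER == DEMO_COMPILER_NVCC>::value,
                "this entry point requires nvcc");
}

// Host-only code path: compiles with any compiler because
// device_only_for_each is never instantiated here.
inline int host_sum(const int* first, const int* last)
{
  int sum = 0;
  for (; first != last; ++first) sum += *first;
  return sum;
}

inline void demo_host_check()
{
  const int data[] = {1, 2, 3};
  assert(host_sum(data, data + 3) == 6);
}
```

Adding a call such as device_only_for_each(data, data + 3) under a host-only compiler would turn the silent pass into a compile-time error at that call site.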

Original comment by jaredhoberock on 31 Jan 2010 at 2:46

GoogleCodeExporter commented 9 years ago

This issue was closed by revision r787.

Original comment by jaredhoberock on 6 Feb 2010 at 3:01

Changed state: Fixed
