lion03 / thrust

Automatically exported from code.google.com/p/thrust
Apache License 2.0

indexing bug in CUDA merge sort #296

GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Reported by Gregory Diamos:

I've been tracking a bug in thrust's merge sort code running on Ocelot with a 
64-bit target and think that I narrowed it down to a bug in thrust involving 
index computation.  In the 'aligned_read' helper function for merge sort, an 
index i is computed as follows:

unsigned int i = warp_size - start_thread_aligned + threadIdx.x;

For certain values of start_thread_aligned, the unsigned arithmetic will wrap 
around to a very large value where a negative number is intended.  Next, this 
value is added to a pointer here:

 first1  += i + src_offset;
 first2  += i + src_offset;
 result1 += i;
 result2 += i;

If these are 64-bit pointers, 'i' will be zero extended to 64 bits and then 
added to the pointers, which yields a large positive offset instead of a small 
negative one.  It should actually be sign extended.  This can be fixed by 
changing the type of 'i' from unsigned int to int.
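
A minimal host-side sketch of the difference, assuming a 32-bit unsigned int 
and a 64-bit target.  The helper names (compute_index, 
offset_with_unsigned_index, offset_with_signed_index, demonstrate_extension) 
and the sample values 32, 40 and 3 are hypothetical and chosen only to trigger 
the wrap-around; they are not taken from stable_merge_sort.inl.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(unsigned int) == 4, "sketch assumes a 32-bit unsigned int");

// Hypothetical stand-in for the index expression in aligned_read.
// thread_idx plays the role of threadIdx.x.
inline unsigned int compute_index(unsigned int warp_size,
                                  unsigned int start_thread_aligned,
                                  unsigned int thread_idx)
{
    // Unsigned arithmetic wraps modulo 2^32 whenever
    // start_thread_aligned > warp_size + thread_idx.
    return warp_size - start_thread_aligned + thread_idx;
}

// Offset seen by 64-bit pointer arithmetic when 'i' is an unsigned int:
// the 32-bit value is zero extended.
inline std::int64_t offset_with_unsigned_index(unsigned int i)
{
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(i));
}

// Offset seen by 64-bit pointer arithmetic when 'i' is declared as int:
// the 32-bit value is sign extended.  The unsigned-to-int conversion is
// modular in C++20 and implementation-defined (two's-complement wrap on
// common compilers) before that.
inline std::int64_t offset_with_signed_index(unsigned int i)
{
    int signed_i = static_cast<int>(i);
    return static_cast<std::int64_t>(signed_i);
}

// Walks through one hypothetical case: 32 - 40 + 3 is meant to be -5.
inline void demonstrate_extension()
{
    unsigned int i = compute_index(32u, 40u, 3u);
    assert(i == 4294967291u);                               // wrapped value
    assert(offset_with_unsigned_index(i) == 4294967291LL);  // zero extended
    assert(offset_with_signed_index(i) == -5);              // sign extended
}
```

With a 32-bit address space both offsets land on the same address after 
truncation, which fits the observation that the problem only shows up with 
64-bit pointers.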

This occurs on lines 668 and 606 of stable_merge_sort.inl.

I guess this has not been an issue on real hardware because these seem to be 
pointers to shared memory, where excessively large addresses seem to wrap 
around.

We should be able to eliminate aligned_read (and aligned_write) entirely in 
favor of block::copy.

Original issue reported on code.google.com by jaredhoberock on 25 Jan 2011 at 12:40

GoogleCodeExporter commented 8 years ago
This issue was closed by revision cb73eb2506.

Original comment by jaredhoberock on 8 Feb 2011 at 5:04