Reported by Gregory Diamos:
I've been tracking a bug in thrust's merge sort code running on Ocelot with a
64-bit target, and I think I have narrowed it down to a bug in thrust's
index computation. In the 'aligned_read' helper function for merge sort, an
index i is computed as follows:
unsigned int i = warp_size - start_thread_aligned + threadIdx.x;
For certain values of start_thread_aligned, the intended result is negative, so
the unsigned computation wraps around to a very large positive value. Next,
this value is added to several pointers here:
first1 += i + src_offset;
first2 += i + src_offset;
result1 += i;
result2 += i;
If these are 64-bit pointers, 'i' will be zero-extended to 64 bits and then
added to the pointers. It should actually be sign-extended. This can be fixed
by changing the type of 'i' from unsigned int to int.
This occurs on lines 668 and 606 of stable_merge_sort.inl.
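To illustrate the difference between the two extensions, here is a minimal host-side sketch. It is not taken from thrust: the function names and the sample values (warp_size = 32, start_thread_aligned = 48, thread index 0) are hypothetical and were chosen only so that the intended index is negative.

```cpp
#include <cstdint>

// Hypothetical stand-in for the index computation in aligned_read,
// using the original 'unsigned int' type for i.
inline std::int64_t offset_with_unsigned_index(unsigned int warp_size,
                                               unsigned int start_thread_aligned,
                                               unsigned int thread_idx)
{
    // Unsigned arithmetic wraps modulo 2^32 when the intended value is negative.
    unsigned int i = warp_size - start_thread_aligned + thread_idx;

    // Widening an unsigned value to 64 bits zero-extends it, which is what
    // happens when i is added to a 64-bit pointer.
    return static_cast<std::int64_t>(i);
}

// Same computation with the proposed fix: i is declared as 'int'.
inline std::int64_t offset_with_signed_index(unsigned int warp_size,
                                             unsigned int start_thread_aligned,
                                             unsigned int thread_idx)
{
    // The conversion to int is implementation-defined before C++20, but it
    // yields the two's complement value on mainstream compilers.
    int i = static_cast<int>(warp_size - start_thread_aligned + thread_idx);

    // Widening a signed value to 64 bits sign-extends it.
    return static_cast<std::int64_t>(i);
}
```

With the sample values above, the unsigned version yields 4294967280 (2^32 - 16) as the 64-bit offset, while the signed version yields -16. When the intended index is non-negative, both versions agree.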
I guess this has not been an issue on real hardware because these seem to be
pointers to shared memory, where excessively large values seem to wrap around.
We should be able to eliminate aligned_read (and aligned_write) entirely in
favor of block::copy.
Original issue reported on code.google.com by jaredhoberock on 25 Jan 2011 at 12:40