What steps will reproduce the problem?
1. Take a key array of floating-point values (float or double) and a value
array of integers. For example:
//key[i] = (double)N / (double)i;
double key[5] = { 0.0, 5.0, 2.5, 1.66667, 1.25 };
int value[5] = { 0, 1, 2, 3, 4 };
2. Use the radix_sort from branch FastSortSm20. Keys in the range 1.xx are
sorted as if they were greater than all other elements.
The output is
key[] = [0, 2.5, 5, 1.25, 1.66667 ]
value[] = [0, 2, 1, 4, 3 ]
3. Using the radix_sort in trunk gives the correct result:
key[] = [0, 1.25, 1.66667, 2.5, 5 ]
value[] = [0, 4, 3, 2, 1 ]
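For cross-checking either branch, a small CPU reference can help. The sketch below is only an illustration, not code from the library: sortable_bits() and reference_sort_pairs() are hypothetical helper names. It uses the common order-preserving bit transform that radix sorts typically apply to floating-point keys (set the sign bit for non-negative values, invert all bits for negative ones), which may or may not match what the branch or trunk does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <numeric>
#include <vector>

// Hypothetical helper: map a double to a 64-bit unsigned integer whose
// unsigned ordering matches the numeric ordering of the doubles (NaNs aside).
inline std::uint64_t sortable_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);               // reinterpret the bits safely
    const std::uint64_t sign = std::uint64_t(1) << 63;
    return (u & sign) ? ~u : (u | sign);         // negative: flip all; else: set sign bit
}

// Hypothetical helper: stable CPU sort of (key, value) pairs by key, comparing
// the transformed bit patterns the way a radix sort would.
inline void reference_sort_pairs(std::vector<double>& keys,
                                 std::vector<int>& values) {
    assert(keys.size() == values.size());
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return sortable_bits(keys[a]) < sortable_bits(keys[b]);
                     });
    std::vector<double> sorted_keys(keys.size());
    std::vector<int> sorted_values(values.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sorted_keys[i] = keys[order[i]];
        sorted_values[i] = values[order[i]];
    }
    keys.swap(sorted_keys);
    values.swap(sorted_values);
}
```

Calling reference_sort_pairs() on the key and value arrays from step 1 should give the same ordering as the trunk output in step 3, so it can serve as a baseline when comparing GPU results.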
What version of the product are you using? On what operating system?
Ubuntu 10.10 x64
CUDA 4.0 (GPU GTX 470)
r893, branches/FastSortSm20
r893, trunk
Please provide any additional information below.
I want to sort the elements in each row vector of a floating-point 2D matrix.
The Entactor::SmallSort() interface in the branch FastSortSm20 seems a good
fit. Even better, it allows a cudaStream to be specified in the interface. I
want to split the row vectors across several streams to take advantage of
concurrent kernel execution. Note that Compute Capability 2.0 hardware
supports up to 16 concurrent kernels.
Original issue reported on code.google.com by addee2...@gmail.com on 3 Jul 2012 at 10:26