wow2006 / cudpp

Automatically exported from code.google.com/p/cudpp

Sorting in emulation mode broken #46

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

Compile and run the following in emulation mode:

#include <stdio.h>
#include <cudpp/cudpp.h>
#include <cuda_runtime.h>
#include <cutil_inline.h>
typedef unsigned int uint;
#define N 12
uint keys[N]    = {111, 37, 430, 433, 431, 357, 6190, 6193, 6191, 6117, 6837, 6911};
uint values[N]  = {37, 111, 433, 430, 357, 431, 6193, 6190, 6117, 6191, 6911, 6837};
int main(){
        cudaSetDevice(0);
        /* device buffers for the key/value pairs */
        uint* keys_dev = 0;
        uint* vals_dev = 0;
        cutilSafeCall(cudaMalloc((void**)&keys_dev, sizeof(uint) * N));
        cutilSafeCall(cudaMalloc((void**)&vals_dev, sizeof(uint) * N));
        /* radix sort of unsigned int keys with attached values */
        CUDPPConfiguration sortConfig;
        sortConfig.algorithm = CUDPP_SORT_RADIX;
        sortConfig.datatype  = CUDPP_UINT;
        sortConfig.op        = CUDPP_ADD;
        sortConfig.options   = CUDPP_OPTION_KEY_VALUE_PAIRS;
        CUDPPHandle sortPlan;
        cudppPlan(&sortPlan, sortConfig, 100 /* num elements */, 1 /* num rows */, 100 /* pitch */);
        printf("Before\n");
        for (uint i = 0; i < N; i++) {
                printf("(%u,\t%u)\n", keys[i], values[i]);
        }
        cutilSafeCall(cudaMemcpy(keys_dev, keys,   sizeof(uint) * N, cudaMemcpyHostToDevice));
        cutilSafeCall(cudaMemcpy(vals_dev, values, sizeof(uint) * N, cudaMemcpyHostToDevice));
        /* sort on all 32 key bits */
        cudppSort(sortPlan, keys_dev, vals_dev, 32, N);
        cutilSafeCall(cudaMemcpy(keys,   keys_dev, sizeof(uint) * N, cudaMemcpyDeviceToHost));
        cutilSafeCall(cudaMemcpy(values, vals_dev, sizeof(uint) * N, cudaMemcpyDeviceToHost));
        printf("After\n");
        for (uint i = 0; i < N; i++) {
                printf("(%u,\t%u)\n", keys[i], values[i]);
        }
        return 0;
}

What is the expected output? What do you see instead?

The output should be a list of sorted keys + values.  Instead:

Before 
(111,   37) 
(37,    111) 
(430,   433) 
(433,   430) 
(431,   357) 
(357,   431) 
(6190,  6193) 
(6193,  6190) 
(6191,  6117) 
(6117,  6191) 
(6837,  6911) 
(6911,  6837) 

After 
(37,    111) 
(111,   37) 
(357,   431) 
(357,   431) 
(357,   431) 
(430,   433) 
(6117,  6191) 
(6117,  6191) 
(6117,  6191) 
(6190,  6193) 
(6837,  6911) 
(6911,  6837) 
(6911,  6837) 

The key/value pairs are indeed sorted; however, some pairs have been duplicated while others are missing.
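
A sorted-order check alone would not catch this symptom, since the keys above are in non-decreasing order. A minimal host-side sketch in plain C that also checks that the output holds exactly the same pairs as the input is shown below. The names `Pair` and `sort_result_is_valid` are hypothetical helpers for illustration and are not part of CUDPP.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type: one key/value pair. */
typedef struct { unsigned int key, value; } Pair;

/* Orders pairs by key, then by value, so equal multisets sort identically. */
static int cmp_pair(const void *a, const void *b)
{
    const Pair *x = a, *y = b;
    if (x->key != y->key)     return x->key   < y->key   ? -1 : 1;
    if (x->value != y->value) return x->value < y->value ? -1 : 1;
    return 0;
}

/* Returns 1 if keys_out is non-decreasing AND the output pairs are a
   permutation of the input pairs; returns 0 otherwise (including on
   allocation failure). */
int sort_result_is_valid(const unsigned int *keys_in, const unsigned int *vals_in,
                         const unsigned int *keys_out, const unsigned int *vals_out,
                         size_t n)
{
    size_t i;
    int ok = 1;
    Pair *in, *out;

    if (n == 0) return 1;

    /* Check 1: keys are in non-decreasing order. */
    for (i = 1; i < n; i++)
        if (keys_out[i - 1] > keys_out[i]) return 0;

    /* Check 2: no pair was duplicated or dropped. */
    in  = malloc(n * sizeof *in);
    out = malloc(n * sizeof *out);
    if (!in || !out) { free(in); free(out); return 0; }
    for (i = 0; i < n; i++) {
        in[i].key  = keys_in[i];  in[i].value  = vals_in[i];
        out[i].key = keys_out[i]; out[i].value = vals_out[i];
    }
    qsort(in,  n, sizeof *in,  cmp_pair);
    qsort(out, n, sizeof *out, cmp_pair);
    for (i = 0; i < n; i++)
        if (in[i].key != out[i].key || in[i].value != out[i].value) { ok = 0; break; }

    free(in);
    free(out);
    return ok;
}
```

To use it with the program above, keep a copy of `keys` and `values` before the sort and pass the copies as the `_in` arguments. The output listed above would fail the second check, because the pair (357, 431) appears three times but occurs only once in the input.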

What version of the product are you using? On what operating system?

Using the version bundled with the CUDA toolkit v3.0 beta1 on both MacOS 10.6 and Ubuntu 9.04.

Please provide any additional information below.

Works correctly when run on the device.

Original issue reported on code.google.com by stuff21@gmail.com on 20 Nov 2009 at 9:46

GoogleCodeExporter commented 9 years ago

Original comment by harr...@gmail.com on 14 Dec 2009 at 6:24

GoogleCodeExporter commented 9 years ago
Fixed in r45: added synchronization in emulation mode to the single-warp sort kernel.
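
For readers wondering how a missing barrier can duplicate and drop pairs instead of merely mis-ordering them, here is a toy model in plain C. It is not the CUDPP kernel and not the r45 change. It assumes that threads in emulation mode run one after another rather than in lockstep as a hardware warp of that era would, and the function names are hypothetical. Each simulated thread `i` performs one compare-exchange with its neighbour `i ^ 1`.

```c
#include <stddef.h>

/* What thread i should hold after the step: the smaller element if it is
   the lower index of its pair, the larger one otherwise. */
static unsigned int exchange_result(const unsigned int *data, size_t n, size_t i)
{
    size_t partner = i ^ 1;
    unsigned int mine = data[i], theirs;
    if (partner >= n) return mine;            /* unpaired last element */
    theirs = data[partner];
    if (i < partner) return mine < theirs ? mine : theirs;
    return mine > theirs ? mine : theirs;
}

/* No barrier: each thread reads and writes before the next thread starts,
   so a later thread can read a slot that an earlier thread already
   overwrote. */
void exchange_step_unsynchronized(unsigned int *data, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] = exchange_result(data, n, i);
}

/* With a barrier: every thread finishes reading (into scratch, standing in
   for registers) before any thread writes. scratch must hold n elements. */
void exchange_step_synchronized(unsigned int *data, size_t n, unsigned int *scratch)
{
    size_t i;
    for (i = 0; i < n; i++)                   /* phase 1: all reads  */
        scratch[i] = exchange_result(data, n, i);
    /* the barrier would sit here */
    for (i = 0; i < n; i++)                   /* phase 2: all writes */
        data[i] = scratch[i];
}
```

Starting from `{111, 37}`, the unsynchronized step yields `{37, 37}`: thread 0 overwrites 111 with 37 before thread 1 has read it, so 111 is lost and 37 is duplicated. The synchronized step yields `{37, 111}`. This matches the kind of corruption reported above, though the actual kernel and fix may differ in detail.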

Original comment by harr...@gmail.com on 14 Dec 2009 at 7:10