tozhovez / cudpp

Automatically exported from code.google.com/p/cudpp

non-multiple of 16 sizes seem to fail for radixsort #69

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Run cudppSort with the radix sort algorithm.
2. Set the number of elements to sort to 3991.

What is the expected output? What do you see instead?
The expected outcome is a sorted device array that can then be passed to
other functions for further processing. Instead I get this error message:

Cuda error: radixSortBlocks in file 'c:/Users/rohitg/Downloads/cudpp_src_1.1.1/cudpp/src/app/radixsort_app.cu' in line 135 : unknown error.

What version of the product are you using? On what operating system?

Please provide any additional information below.
I am using Visual C++ 2008 with CUDA 3.1 and CUDPP 1.1.1 on an x64 machine.
My project involves calling a function repeatedly.
*************************************************
main{

function_Call(n) //top_level

}

function_Call(int n){

setup the sort plan for n.

some other kernel calls

call sort plan

use cudppSort results in other kernels

some more other kernel calls

destroy plan

free all memory 
return
}
*************************************************
I have placed error checking around all kernels surrounding the cudppSort call
and also where the plan is set up.

The first time the top-level function is called, it is called with n=4096. The
next time it is called with n=3991, and then the error I mentioned earlier
appears.

Could you suggest whether I might be doing something wrong, or what else might
be happening?

thanks and regards

rohit

Original issue reported on code.google.com by itabhiya...@gmail.com on 16 Dec 2010 at 4:05

GoogleCodeExporter commented 8 years ago
Looking at test_radixsort.cpp in cudpp_testrig, I note that we test with many 
non-multiple-of-16 sizes:

    size_t test[] = {39, 128, 256, 512, 513, 1000, 1024, 1025, 32768, 
                     45537, 65536, 131072, 262144, 500001, 524288, 
                     1048577, 1048576, 1048581, 2097152, 4194304, 
                     8388608};

Can you first run the testrig and see if it works properly for you, and then 
replace one of the test[] items with 3991 and see if it works too? That would 
help to isolate the problem.

Original comment by jow...@gmail.com on 16 Dec 2010 at 5:13

GoogleCodeExporter commented 8 years ago
Hi

I ran the testrig and also changed one of the values to arbitrary sizes like
those my simulation passes to the CUDPP library (3991, 3642, etc.). That didn't
cause any problems in the testrig (all tests passed).

However, the error still appears in my own code, and I now also see in my error
log, which stores the result of cudaGetErrorString(cudaGetLastError()), that the
Mersenne Twister random number generator flags an unknown error as well.
The radix sort error appears only after a couple of kernels have flagged this
error and this kernel showed "no error" in the log.

This makes me think the error is somewhere else and that the asynchronous
nature of error reporting is probably confusing me.

Do you think so too? Thanks for your prompt reply and help.

rohit

Original comment by itabhiya...@gmail.com on 17 Dec 2010 at 9:37
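
On the question of asynchronous error reporting raised above: kernel launches return before the kernel finishes, so an execution fault in one kernel is typically reported by a later call, which can make an innocent kernel such as radixSortBlocks look like the culprit. A minimal sketch of a debug-only check that synchronizes after each launch is shown here. The names `reportCuda`, `CHECK_KERNEL`, `scaleKernel` and `runCheckedLaunch` are hypothetical and not part of CUDPP or the reporter's code, and `cudaDeviceSynchronize()` assumes CUDA 4.0 or later (on CUDA 3.x the equivalent call was `cudaThreadSynchronize()`).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Returns true when `err` is cudaSuccess; otherwise prints where it was seen.
static bool reportCuda(cudaError_t err, const char* label,
                       const char* file, int line) {
    if (err == cudaSuccess) return true;
    std::fprintf(stderr, "CUDA error after %s (%s:%d): %s\n",
                 label, file, line, cudaGetErrorString(err));
    return false;
}

// Debug-only check: first pick up launch errors, then wait for the kernel to
// finish so that execution errors are attributed to this launch, not a later one.
#define CHECK_KERNEL(label)                                            \
    (reportCuda(cudaGetLastError(), (label), __FILE__, __LINE__) &&    \
     reportCuda(cudaDeviceSynchronize(), (label), __FILE__, __LINE__))

// Hypothetical stand-in for one of the "other kernel calls" in the report.
__global__ void scaleKernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;  // bounds check matters when n is not a multiple of the block size
}

// Runs one checked launch on n elements; returns true if no error was reported.
bool runCheckedLaunch(int n) {
    float* d_data = 0;
    if (!reportCuda(cudaMalloc((void**)&d_data, n * sizeof(float)),
                    "cudaMalloc", __FILE__, __LINE__)) {
        return false;
    }
    bool ok = reportCuda(cudaMemset(d_data, 0, n * sizeof(float)),
                         "cudaMemset", __FILE__, __LINE__);
    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
        scaleKernel<<<blocks, threads>>>(d_data, n, 2.0f);
        ok = CHECK_KERNEL("scaleKernel");
    }
    cudaFree(d_data);
    return ok;
}
```

Placing a check like `CHECK_KERNEL` after every launch that precedes the sort (for example `runCheckedLaunch(3991)` as a smoke test) should point at the first kernel that actually fails. Because it serializes GPU work, it is usually enabled only while debugging.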

GoogleCodeExporter commented 8 years ago
Hi

I tried commenting out each function one by one, and the only time I don't get
errors is when I comment out ONLY the declaration (and destruction) of the
CUDPP plan for sort and the call to cudppSort.

This only happens when I pass values of n other than 4096 (my starting value;
after that, the number of elements to sort decreases by 17 every time I declare
a new plan in each new iteration).

So n keeps changing. With all other kernels uncommented and the CUDPP calls
commented out, my code runs.

The error is the same as the one I described before.

thanks

Original comment by itabhiya...@gmail.com on 17 Dec 2010 at 2:35

GoogleCodeExporter commented 8 years ago
Hi Rohit,

Can you please send me your complete code sample that uses cudpp? I will try
to reproduce the issue.

Thanks,
Ritesh

Original comment by rites...@gmail.com on 31 Jan 2011 at 6:40

GoogleCodeExporter commented 8 years ago
No response and insufficient info.  Closing.

Original comment by harr...@gmail.com on 6 Jul 2011 at 2:16