byzhang / cudpp

Automatically exported from code.google.com/p/cudpp

tridiagonal solver fails for systems of two equations. (NaN in place of result) #124

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Modify the test_tridiagonal example that comes with cudpp. The function 
testTridiagonalDataType declares an array of systemSizes as follows:

    int systemSizes[] = { 5, 32, 39, 128, 177, 255, 256, 500, 512 };

2. Replace that line with something like:

    int systemSizes[] = { 2, 5, 32 }; // only interested in the first one

3. Then compile and execute test_tridiagonal.

test@test:~/cudpp/bin$ ./test_tridiagonal 
Using device 0:
Tesla C2070; global mem: 1341587456B; compute v2.0; clock: 1147000 kHz
Running a fp64 CR-PCR tridiagonal solver solving 512 systems of 2 equations
GPU execution time: 0.044000 ms
CPU execution time: 0.022000 ms
test failed, error is larger than 0.001
test FAILED  <<<<<<<<<<<<<<<<<<<<<<<  2 EQUATIONS SYSTEMS FAIL

Running a fp64 CR-PCR tridiagonal solver solving 512 systems of 5 equations
GPU execution time: 0.065000 ms
CPU execution time: 0.054000 ms
test PASSED

Running a fp64 CR-PCR tridiagonal solver solving 512 systems of 32 equations
GPU execution time: 0.125000 ms
CPU execution time: 0.340000 ms
test PASSED

What is the expected output? What do you see instead?

I expected the 2-equation test to pass like the others. Instead, on closer 
inspection (using a debugger), the output array (x) contains NaN.
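
The same check can be done without a debugger by scanning a host-side copy of the output array. This is a minimal sketch: `firstNaNIndex` is a hypothetical helper, not part of CUDPP or of test_tridiagonal, and it assumes the result has already been copied back from the device into a `std::vector`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUDPP): returns the index of the first
// NaN in a host-side copy of the solution array x, or -1 if there is none.
template <typename T>
long firstNaNIndex(const std::vector<T>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::isnan(x[i])) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Calling it on the solution for the 512 systems of 2 equations would show whether every system is affected or only some of them.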

What version of the product are you using? On what operating system?

This happens with the latest version of CUDPP 2.0 available from git, on 
LTS 12.04, using CUDA 5.0 with a Tesla C2070 GPU.

Please provide any additional information below.

I also tried this in an emulated environment (using Ocelot) and got the same 
results.

Original issue reported on code.google.com by omar.val...@gmail.com on 22 Nov 2012 at 2:42

GoogleCodeExporter commented 9 years ago
Yao, can you please investigate?

Original comment by harr...@gmail.com on 11 Dec 2012 at 2:07

GoogleCodeExporter commented 9 years ago
Sure, I will take a look this week.

I'm curious why (in what applications) a 2-equation system is of any interest. 
If you really need performance in addition to correctness (which I hope I will 
fix), I don't think this tridiagonal routine will perform well, because a 
2-equation system will lead to low utilization of the GPU with only 1 thread 
per block.

Original comment by zhangyao...@gmail.com on 11 Dec 2012 at 5:15
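
To put a rough number on the utilization point in the comment above, here is a minimal sketch. It assumes one thread per pair of equations (n/2 threads per block, which matches the "1 thread per block" figure quoted for n = 2), power-of-two system sizes, and a warp size of 32. `warpLaneUtilization` is a hypothetical name and is not part of CUDPP; the actual thread mapping inside the CR-PCR solver may differ.

```cpp
// Hypothetical helper (not part of CUDPP): fraction of warp lanes that do
// useful work, ASSUMING threadsPerBlock = systemSize / 2 (at least 1) and
// that a block occupies whole warps of warpSize lanes.
double warpLaneUtilization(int systemSize, int warpSize = 32) {
    int threads = systemSize / 2;
    if (threads < 1) {
        threads = 1;
    }
    // Number of warps needed to hold these threads, rounded up.
    const int warps = (threads + warpSize - 1) / warpSize;
    return static_cast<double>(threads) / (warps * warpSize);
}
```

Under these assumptions, a 2-equation system keeps 1 of 32 lanes busy (about 3%), a 32-equation system keeps half of them busy, and systems of 64 equations or more fill their warps.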

GoogleCodeExporter commented 9 years ago
Yes, you're right. I don't intend to use the tridiagonal solver for small 
systems. My code uses a finite difference routine that starts with a small 
linear system and progressively refines the step size (so the number of 
equations to solve increases) until the solution reaches the desired accuracy.

In fact, the code will branch between CPU and GPU execution depending on the 
size of n. I wanted to plot CPU vs. GPU timings to find out where the sweet 
spot for switching probably is. That's how I found out about those NaNs. 

Original comment by omar.val...@gmail.com on 12 Dec 2012 at 11:09
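
The CPU/GPU branching described above could be sketched as follows. This is an illustration under stated assumptions, not the reporter's code: the CPU path uses the standard Thomas algorithm without pivoting, `gpuThreshold` is a hypothetical crossover size that would have to come from the timing plot, and `gpuSolver` is a caller-supplied placeholder for whatever wraps the GPU routine. None of these names belong to CUDPP.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
// Hypothetical signature for a caller-supplied GPU path (an assumption).
using Solver = std::function<Vec(const Vec&, const Vec&, const Vec&, const Vec&)>;

// CPU path: Thomas algorithm for one tridiagonal system of n equations.
// a: sub-diagonal (a[0] ignored), b: main diagonal, c: super-diagonal
// (c[n-1] ignored), d: right-hand side. All four must have size n.
// No pivoting, so this assumes a well-conditioned system
// (for example, a diagonally dominant one).
Vec solveThomas(const Vec& a, const Vec& b, const Vec& c, const Vec& d) {
    const std::size_t n = b.size();
    Vec cp(n), dp(n), x(n);
    if (n == 0) {
        return x;
    }
    // Forward elimination.
    cp[0] = c[0] / b[0];
    dp[0] = d[0] / b[0];
    for (std::size_t i = 1; i < n; ++i) {
        const double m = b[i] - a[i] * cp[i - 1];
        cp[i] = c[i] / m;
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m;
    }
    // Back substitution.
    x[n - 1] = dp[n - 1];
    for (std::size_t i = n - 1; i-- > 0;) {
        x[i] = dp[i] - cp[i] * x[i + 1];
    }
    return x;
}

// Dispatch: small systems (or no GPU path supplied) go to the CPU.
Vec solveTridiagonal(const Vec& a, const Vec& b, const Vec& c, const Vec& d,
                     std::size_t gpuThreshold, const Solver& gpuSolver) {
    if (b.size() < gpuThreshold || !gpuSolver) {
        return solveThomas(a, b, c, d);
    }
    return gpuSolver(a, b, c, d);
}
```

With a dispatch like this, the 2-equation case never reaches the GPU solver in production, but it still has to be handled correctly when timing both paths across all sizes.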