jcohenpersonal / opencurrent

Automatically exported from code.google.com/p/opencurrent

7 of 17 unit tests failed #8

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. build for sm_10 (using netcdf4 and a GeForce 8800 GTX)
2. make test

What is the expected output? 
All tests should be ok.

What do you see instead?
see attached unittest.txt

What version of the product are you using? On what operating system?
1.0.0 on Ubuntu 8.04 LTS

Please provide any additional information below.
I ran utest MultigridTest under a debugger and found that at line 388 of
sol_mgpressure3ddev.cu the value of _num_level is not 6, as it should be,
but a very large value, which suggests that memory is being overwritten.
I also ran utest MultigridTest under valgrind; the output is in the
attached file memcheck.txt.

Original issue reported on code.google.com by asr@ddsw.nl on 16 Mar 2010 at 8:55

Attachments:

GoogleCodeExporter commented 9 years ago
Could you try checking out the latest version from hg and see whether you can
reproduce the failures?

Instructions here:
http://code.google.com/p/opencurrent/source/checkout

Also, are you on a 32-bit or 64-bit OS?

Original comment by jcohen.p...@gmail.com on 16 Mar 2010 at 1:45

GoogleCodeExporter commented 9 years ago
Dear Jonathan,

I downloaded your latest version per your instructions and got the
attached output from the unit tests. My system is 32-bit and has a
GeForce 8800 GTX, which supports only floats, not doubles. I use NetCDF 4.0.

Running utest MultigridTest in a debugger now revealed that the
segmentation fault was caused by 'this' in the code below having the
value 0xad0, which is obviously not a valid address of an object on
Linux.

template <typename T>
Grid2DDevice<T>::~Grid2DDevice()
{
   // With 'this' == 0xad0, reading _buffer dereferences an invalid address.
   cudaFree(this->_buffer);
}

This destructor was called from delete _diag_grid[l]; (line 335 of
sol_mgpressure3ddev.cu).
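
One common way to end up with a garbage 'this' in a destructor is calling delete through a pointer slot that was never initialized or was overwritten. Whether that is what happens in sol_mgpressure3ddev.cu is not established here. The sketch below, in plain C++ with no CUDA calls, shows a defensive ownership pattern in which unallocated levels are guaranteed to be null. The names FakeGrid, LevelOwner, and live_after_partial_setup are hypothetical and are not part of OpenCurrent.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a device grid. It only counts live instances
// so that cleanup can be checked; it does not touch the GPU.
struct FakeGrid {
    static int live;
    FakeGrid() { ++live; }
    ~FakeGrid() { --live; }
};
int FakeGrid::live = 0;

// Hypothetical owner of one grid per multigrid level.
class LevelOwner {
public:
    // Every slot starts as an empty unique_ptr, so no slot ever holds
    // an indeterminate address.
    explicit LevelOwner(std::size_t num_levels) : grids_(num_levels) {}

    // Allocate a single level; at() throws if the index is out of range.
    void allocate(std::size_t level) {
        grids_.at(level).reset(new FakeGrid());
    }

    bool has_level(std::size_t level) const {
        return grids_.at(level) != nullptr;
    }

    std::size_t num_levels() const { return grids_.size(); }

    // No hand-written destructor: each unique_ptr deletes its grid if it
    // owns one and does nothing for empty slots.
private:
    std::vector<std::unique_ptr<FakeGrid> > grids_;
};

// Allocates only some levels, destroys the owner, and returns the number
// of grids still alive afterwards.
int live_after_partial_setup() {
    {
        LevelOwner owner(6);
        owner.allocate(0);
        owner.allocate(3);
        assert(FakeGrid::live == 2);
    }
    return FakeGrid::live;
}
```

With raw pointers the equivalent precaution would be to set every array entry to a null pointer before any allocation, since delete on a null pointer is a no-op, whereas delete on an uninitialized pointer is undefined behavior.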

Hope this helps!

Original comment by asr@ddsw.nl on 19 Mar 2010 at 9:23

Attachments:

GoogleCodeExporter commented 9 years ago
Dear Jonathan,
Last week I got a new graphics card. I now have a GTX-285. This made things
even worse.
I recompiled for sm_13 and got the attached output for make test. This is the
version I downloaded on March 19. I also tried the official version 1.0.0
again and here too things got worse. I attach both make test outputs.
I hope you can figure out what is wrong.
Sincerely, Anneke

Original comment by asr@ddsw.nl on 31 Mar 2010 at 9:31

Attachments:

GoogleCodeExporter commented 9 years ago
hmm, it looks like this has nothing to do with the graphics card, and maybe
nothing to do with cuda.  Might be an issue with how netcdf was built?

My guess is a libc incompatibility or something like that.
Could you run 'ldd -v tests/utest' and attach the results?

Original comment by jcohen.p...@gmail.com on 31 Mar 2010 at 1:42

GoogleCodeExporter commented 9 years ago
Here are the results. btw, if I compile for netcdf 3, which is included in
Ubuntu 8.04, so not built by me, the problems are only slightly different, see
the other attachment which was made with the recent opencurrent snapshot, not
with 1.0.0.

Original comment by asr@ddsw.nl on 31 Mar 2010 at 3:22

Attachments:

GoogleCodeExporter commented 9 years ago
are you able to compile and run the cuda sdk samples?  or any cuda program?

Original comment by jcohen.p...@gmail.com on 31 Mar 2010 at 3:34

GoogleCodeExporter commented 9 years ago
Yes, I am able to compile and run all CUDA example programs, plus Iterative
CUDA and OpenNL with CUDA support.
I hope this message gets through at last; I have tried repeatedly over the
past few weeks. There seems to be some problem with me contacting this
bulletin board system.

Original comment by asr@ddsw.nl on 3 May 2010 at 12:29