lion03 / thrust

Automatically exported from code.google.com/p/thrust
Apache License 2.0

Erratic Results thrust::transform with unsigned long long #344

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Please post a short, self-contained code sample which reproduces the
problem:
  Using thrust::transform on unsigned long long values gives me erratic results.

  Exact test:
     Apply
        a = UINT64_MAX - a;
     to each element of an array, and compare the result against the input after every two iterations. Applying the operation twice restores the original value, so the two should be equal.
  The attached code runs this test on the host and on the device 50 times.
  If it fails, two files are generated, input_data and device_data,
  containing the input and the incorrect data.
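
For reference, the check described above can be sketched on the host alone. This is not the attached minimal.cu; it is a plain C++ illustration using std::transform instead of thrust::transform, and the function names (complement_in_place, first_failing_pass) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>   // for callers that want to assert on the result
#include <cstddef>
#include <cstdint>
#include <vector>

// Apply a -> UINT64_MAX - a to every element, in place.
// (Hypothetical helper; host-only stand-in for the thrust::transform call.)
void complement_in_place(std::vector<std::uint64_t>& v) {
    std::transform(v.begin(), v.end(), v.begin(),
                   [](std::uint64_t a) { return UINT64_MAX - a; });
}

// Run `iterations` passes over a copy of `input`. After every second pass the
// data must equal the input again, because the operation is its own inverse.
// Returns the pass number at which the check first fails, or 0 if it never fails.
std::size_t first_failing_pass(const std::vector<std::uint64_t>& input,
                               std::size_t iterations) {
    std::vector<std::uint64_t> work = input;
    for (std::size_t pass = 1; pass <= iterations; ++pass) {
        complement_in_place(work);
        if (pass % 2 == 0 && work != input) {
            return pass;
        }
    }
    return 0;
}
```

On a correct implementation first_failing_pass(input, 50) returns 0 for any input. A nonzero result on the device path would correspond to the failure reported here.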

What is the expected output? What do you see instead?
  Sometimes the output is correct, but when the test is iterated I get errors after a few iterations.  Examining the output data, most values are correct; the errors seem to be mostly transpositions of values, but some values are missing entirely.
  The test gives the correct output when run on the host.

What version of Thrust are you using? Which version of nvcc?  Which host
compiler?  On what operating system?  Which GPU?
  I am using the current development version of Thrust.  I tried compiling with nvcc 4.0 and nvcc 3.2, using the CUDA 4.0 production release drivers.
The host machine is running 64-bit Red Hat Linux 5.3.
The GPU is a Tesla T10, part of a Tesla S1070; I have tried all of the GPUs on
this device.
Compiled with:
  nvcc -o test -arch=sm_13 -O2 minimal.cu

Please provide any additional information below.

Curious to see if anyone can replicate this; it's really puzzling me.  I've tried
several variations on the sample I gave and I get the same behavior.  My
original version used a unary function, without a constant iterator or thrust::minus,
and it displayed the same problem.

If I set the vector size very small (1000 or so), then it seems to work.

Original issue reported on code.google.com by scott.ro...@gmail.com on 30 Jun 2011 at 7:00

Attachments:

GoogleCodeExporter commented 8 years ago
Simpler case:
  Runs thrust::transform with thrust::identity in place and checks the answer.
  Displays the same behaviour for me.

Original comment by scott.ro...@gmail.com on 30 Jun 2011 at 7:08

Attachments:

GoogleCodeExporter commented 8 years ago
Here I get the same behaviour without using Thrust, and with the copy done out
of place.
I'll post it on the NVIDIA forums to make sure I'm not missing
something dumb.
In my tests:
  The incorrect values always appear in pairs.
  If I use unsigned int, I don't have any problems.
  It does work sometimes (infrequently).
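
One way to check the "pairs" observation against a dump such as input_data and device_data is sketched below. It is a host-only C++ illustration, not taken from the attachments. The names MismatchReport and classify_mismatches are hypothetical, and the sketch assumes the exchanged values sit at neighbouring indices, which the reports above do not confirm.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summary of how `actual` differs from `expected` (hypothetical type).
struct MismatchReport {
    std::size_t mismatches = 0;      // positions where actual != expected
    std::size_t adjacent_swaps = 0;  // pairs (i, i+1) whose values are exchanged
};

// Count mismatching positions and how many of them form adjacent swaps.
// Assumption: a "transposition" means two neighbouring elements exchanged.
MismatchReport classify_mismatches(const std::vector<std::uint64_t>& expected,
                                   const std::vector<std::uint64_t>& actual) {
    MismatchReport report;
    const std::size_t n = std::min(expected.size(), actual.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (actual[i] == expected[i]) {
            continue;
        }
        const bool swapped_with_next =
            i + 1 < n &&
            actual[i] == expected[i + 1] &&
            actual[i + 1] == expected[i];
        if (swapped_with_next) {
            report.mismatches += 2;  // both halves of the pair are wrong
            ++report.adjacent_swaps;
            ++i;                     // skip the partner so it is not counted twice
        } else {
            ++report.mismatches;     // wrong value that is not part of an adjacent swap
        }
    }
    return report;
}
```

If mismatches equals 2 * adjacent_swaps, every error is a neighbouring transposition. Any remainder corresponds to values that are wrong for some other reason, such as the missing values mentioned in the original report.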

Original comment by scott.ro...@gmail.com on 30 Jun 2011 at 10:41

Attachments:

GoogleCodeExporter commented 8 years ago
Here's the result for my sm_11 device:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2011 NVIDIA Corporation
Built on Thu_May_12_11:09:45_PDT_2011
Cuda compilation tools, release 4.0, V0.2.1221

$ cat /proc/driver/nvidia/gpus/0/*
Model:       Quadro NVS 160M
IRQ:         16
Video BIOS:      62.98.68.00.04
Card Type:   PCI-E
DMA Size:    40 bits
DMA Mask:    0xffffffffff
Bus Location:    0000:01.00.0
Binary: ""

$ ./a.out 
testing 100 times on device 0
Output correct 100 times on device 0

Original comment by wnbell on 30 Jun 2011 at 6:20

GoogleCodeExporter commented 8 years ago
I can reproduce the problem on an sm_10 device:

jhoberock@nvresearch-test0:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2011 NVIDIA Corporation
Built on Thu_May_12_11:09:45_PDT_2011
Cuda compilation tools, release 4.0, V0.2.1221
jhoberock@nvresearch-test0:~$ cat /proc/driver/nvidia/gpus/0/*
Model:       GeForce 8800 GTS 512
IRQ:         16
Video BIOS:      ??.??.??.??.??
Card Type:   PCI-E
DMA Size:    40 bits
DMA Mask:    0xffffffffff
Bus Location:    0000:03.00.0
Binary: ""
jhoberock@nvresearch-test0:~$ ./a.out 
testing 100 times
a.out: submitted.cu:48: int main(int, char**): Assertion 
`std::equal(h_vec1.begin(), h_vec1.end(), h_vec2.begin())' failed.
Aborted

I've attached a smaller version of the code, which I've forwarded as nvbug 846540.

Original comment by jaredhoberock on 30 Jun 2011 at 6:27

Attachments: