Closed GoogleCodeExporter closed 9 years ago
I'm sorry. I was compiling a .cpp file with nvcc. I symlinked it to bugs.cu
and recompiled and ran the tests again with CUDA.
fill_n, without SOA
nvcc 3.2rc2 (centos): compiles and gives correct results
fill_n, with SOA
nvcc 3.2rc2 (centos): does not compile:
thrust/detail/device/cuda/for_each.inl(72): error: no instance of overloaded function "thrust::detail::generate_functor<Generator>::operator() .... " matches the argument list
copy_n, without SOA
nvcc 3.2rc2 (centos): compiles and gives correct results
copy_n, with SOA
nvcc 3.2rc2 (centos): compiles and gives correct results
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 3:31
What about this one?
fill_n, with SOA
nvcc 3.2rc2 (centos): does not compile:
thrust/detail/device/cuda/for_each.inl(72): error: no instance of overloaded function "thrust::detail::generate_functor<Generator>::operator() .... " matches the argument list
Were you able to compile or not?
Original comment by jaredhoberock
on 10 Nov 2010 at 3:36
Hi Jared, It didn't compile. The compiler error message is right there in
what you just posted.
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 3:41
OK, thanks, we'll have a look.
Original comment by jaredhoberock
on 10 Nov 2010 at 3:44
Thanks!
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 3:50
This issue was closed by revision 70f1c9c123.
Original comment by jaredhoberock
on 10 Nov 2010 at 10:01
Thank you for fixing the compiler bug. fill_n compiles and works now in all
cases that I tested.
But copy_n + SOA is still not working with OpenMP unless I compile with
optimization. The following is on a Mac with MacPorts g++-4.5; Apple g++-4.2
behaves the same. I also tested with g++-4.5 on 64-bit CentOS and got garbage
results unless I compiled with optimization. Intel icpc is fine on both Mac
and CentOS.
humphrey% g++-mp-4.5 bugs.cpp -fopenmp -I ../lcpprivate/thrust
-DTHRUST_DEVICE_BACKEND=THRUST_DEVICE_BACKEND_OMP -o bugs
humphrey% ./bugs
starting
(5 (0 0 0) 0)
(0 (0 25 0) 0)
copying
getting values
printing
(2.12201e-314 (6.95322e-310 6.95322e-310 6.95322e-310) 2.12201e-314)
(6.95322e-310 (6.95322e-310 6.95322e-310 2.12201e-314) 6.95322e-310)
humphrey% g++-mp-4.5 -O3 bugs.cpp -fopenmp -I ../lcpprivate/thrust
-DTHRUST_DEVICE_BACKEND=THRUST_DEVICE_BACKEND_OMP -o bugs
humphrey% ./bugs
starting
(5 (0 0 0) 0)
(0 (0 25 0) 0)
copying
getting values
printing
(5 (0 0 0) 0)
(0 (0 25 0) 0)
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 10:40
OK, so it's not just SOA: even without SOA I still get garbage output unless I
compile with optimization when using copy_n + make_constant_iterator. CUDA is
fine in all cases, by the way.
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 10:48
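For readers unfamiliar with the pattern in the comment above, here is a minimal sketch of what copy_n from a constant iterator is expected to produce. It uses only the C++ standard library, not Thrust, and it is not the attached bugs.cpp. `ConstantIterator` and `copy_constant` are hypothetical names introduced for illustration; `ConstantIterator` is only a rough stand-in for what thrust::make_constant_iterator returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a constant iterator: every dereference
// yields the same value, and incrementing does nothing.
template <typename T>
class ConstantIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = T;
    using difference_type = std::ptrdiff_t;
    using pointer = const T*;
    using reference = const T&;

    explicit ConstantIterator(const T& value) : value_(value) {}

    reference operator*() const { return value_; }
    ConstantIterator& operator++() { return *this; }
    ConstantIterator operator++(int) { return *this; }
    bool operator==(const ConstantIterator& other) const {
        return value_ == other.value_;
    }
    bool operator!=(const ConstantIterator& other) const {
        return !(*this == other);
    }

private:
    T value_;  // the iterator owns its own copy of the value
};

// Fill a fresh vector with n copies of `value` by copying from a
// constant iterator (illustrative helper, not a Thrust function).
template <typename T>
std::vector<T> copy_constant(const T& value, std::size_t n) {
    std::vector<T> out(n);
    std::copy_n(ConstantIterator<T>(value), n, out.begin());
    return out;
}
```

Calling `copy_constant(5.0, 4)` should give a vector holding four copies of 5.0 at any optimization level. The comment above reports the Thrust OpenMP backend producing garbage for the equivalent copy_n + make_constant_iterator call in unoptimized builds.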
Original comment by jaredhoberock
on 10 Nov 2010 at 10:49
Andrew,
Please let me know if the changes in my clone [1] fix these remaining
issues.
[1] https://jaredhoberock-thrust-no-referenced-state.googlecode.com/hg/
Original comment by jaredhoberock
on 10 Nov 2010 at 10:57
That did the trick! Thank you!
Original comment by andrew.c...@gmail.com
on 10 Nov 2010 at 11:02
This issue was closed by revision ac4def6859.
Original comment by jaredhoberock
on 10 Nov 2010 at 11:46
Original issue reported on code.google.com by
andrew.c...@gmail.com
on 10 Nov 2010 at 2:58