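The attached program reproduces the problem: when linked against Ocelot, every
device reports maxThreadsPerMultiprocessor as 0, whereas the native CUDA runtime
reports 1536 for the same GPU. A sensible value for the Ocelot backends would be
the same value as maxThreadsPerBlock in the cudaFuncAttributes struct (16K, IIRC).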
$ nvcc ocelot_bug.cu
$ ./a.out
Device: NVS 4200M
maxThreadsPerMultiprocessor 1536
$ nvcc ocelot_bug.cu `OcelotConfig -l`
$ ./a.out
Device: NVS 4200M
maxThreadsPerMultiprocessor 0
Device: Ocelot PTX Emulator
maxThreadsPerMultiprocessor 0
Device: Ocelot Multicore CPU Backend (LLVM-JIT)
maxThreadsPerMultiprocessor 0
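
The attachment itself is not reproduced here. As a rough sketch only, a program producing output of the shape above could look like the following, assuming it simply enumerates the devices and prints the maxThreadsPerMultiProcessor field of cudaDeviceProp. The structure and error handling are my guess, not the contents of the actual ocelot_bug.cu.

```cuda
// Hypothetical reconstruction of a reproducer; NOT the attached ocelot_bug.cu.
// Assumes the reported value comes from cudaDeviceProp::maxThreadsPerMultiProcessor.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Ask the runtime (native CUDA or Ocelot, depending on what was linked)
    // how many devices it exposes.
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < count; ++i) {
        // Query the property struct for device i.
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d): %s\n",
                         i, cudaGetErrorString(err));
            return 1;
        }

        // Print the device name and the field in question.
        std::printf("Device: %s\n", prop.name);
        std::printf("maxThreadsPerMultiprocessor %d\n",
                    prop.maxThreadsPerMultiProcessor);
    }
    return 0;
}
```

Built with plain nvcc this queries the native runtime; built with the `OcelotConfig -l` flags shown above it would go through Ocelot's implementation of the same calls, which is where the zeros appear.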
Original issue reported on code.google.com by wnbell on 6 Oct 2011 at 7:18