When profiling Thrust applications with tools like the CUDA Visual Profiler or
Parallel Nsight, it would be useful to have algorithms reported in a more
straightforward way. Specifically, rather than having each individual kernel,
with its lengthy mangled name, appear in the profile, it would be preferable
to have simple expressions like "thrust::sort" or perhaps
"thrust::sort<FooIterator>".
Ideally the tools above would support some sort of stack-based mechanism to
aggregate and nest kernels into logical algorithms.
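
To make the request concrete, here is a rough host-side sketch of what such a stack-based mechanism could look like. Everything in it is hypothetical: `RangeStack`, `ScopedRange`, `record_kernel`, and `example_profile` are illustrative names, not part of Thrust or of any profiler, the kernel names are placeholders, and the nesting of one algorithm inside another is for illustration only. Kernels launched while a range is open are counted under the path of open ranges instead of under their mangled names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical range stack; not an existing Thrust or profiler API.
class RangeStack {
public:
    // Open a named range, e.g. "thrust::sort".
    void push(const std::string& name) { open_.push_back(name); }

    // Close the innermost open range (no-op if none is open).
    void pop() {
        if (!open_.empty()) open_.pop_back();
    }

    // Count a kernel launch under the path of currently open ranges.
    // With no open range, fall back to the kernel's own name.
    void record_kernel(const std::string& kernel_name) {
        std::string label;
        for (std::size_t i = 0; i < open_.size(); ++i) {
            if (i != 0) label += " > ";
            label += open_[i];
        }
        if (label.empty()) label = kernel_name;
        ++counts_[label];
    }

    const std::map<std::string, int>& counts() const { return counts_; }

private:
    std::vector<std::string> open_;     // stack of open range names
    std::map<std::string, int> counts_; // kernel launches per range path
};

// RAII guard so a range is closed when the enclosing scope exits.
class ScopedRange {
public:
    ScopedRange(RangeStack& stack, const std::string& name) : stack_(stack) {
        stack_.push(name);
    }
    ~ScopedRange() { stack_.pop(); }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;

private:
    RangeStack& stack_;
};

// Simulated run: two kernels inside one algorithm, one of them inside a
// nested algorithm, plus one kernel outside any range.
std::map<std::string, int> example_profile() {
    RangeStack stack;
    {
        ScopedRange sort_range(stack, "thrust::sort");
        stack.record_kernel("placeholder_mangled_kernel_a");
        {
            ScopedRange scan_range(stack, "thrust::inclusive_scan");
            stack.record_kernel("placeholder_mangled_kernel_b");
        }
        stack.record_kernel("placeholder_mangled_kernel_c");
    }
    stack.record_kernel("placeholder_standalone_kernel");
    return stack.counts();
}
```

In this sketch the profile would list "thrust::sort" and "thrust::sort > thrust::inclusive_scan" as entries, and only the kernel launched outside any range would appear under its own name. A real implementation would need the profiling tools themselves to expose and honor such ranges.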
Original issue reported on code.google.com by wnbell on 13 Oct 2010 at 12:54