Compile time keeps getting longer as we add more functionality. The CUDA
compiler is really slow at compiling template functions with multiple
parameters, and we use a lot of them. There are something like 384 different
scan kernels, for example, and a similar number for segscan.

How can we reduce this code explosion? Can we give feedback to the CUDA
compiler team? (Emu mode compiles WAY faster, for example.)
Original issue reported on code.google.com by harr...@gmail.com on 25 Jun 2009 at 12:17