m4rs-mt / ILGPU

ILGPU JIT Compiler for high-performance .Net GPU programs
http://www.ilgpu.net

Can multiple threads be available for CPU calculation of some algorithm such as Scan, RadixSort? #152

Closed GeoBIM2020 closed 3 years ago

GeoBIM2020 commented 4 years ago

ILGPU currently has no GPU debugging support, so I use the CPU accelerator for debugging. It seems that some of the algorithms in ILGPU.Algorithms run on only one thread on the CPU. This is a problem for me: I have several kernels, and I run RadixSort, Scan, etc. from ILGPU.Algorithms before running the other kernels. When the data set is very large, I have to wait a long time for RadixSort/Scan to complete before I can debug those kernels. How can I enable multiple threads for the provided algorithms to speed up the CPU calculation? And what is the timeline for GPU debugging? Thanks!

Yey007 commented 4 years ago

I'm not too sure if this is your problem, but you can provide the maximum number of threads when creating a new CPU accelerator.
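A minimal sketch of what that might look like, assuming the ILGPU 0.x API that was current when this issue was opened, where `Context` has a public constructor and `CPUAccelerator` takes the thread count as its second constructor argument. Newer ILGPU releases create accelerators differently, so treat the exact calls as assumptions to check against your installed version. This only sets the accelerator's thread count; it does not by itself make any particular algorithm run in parallel.

```csharp
using System;
using ILGPU;
using ILGPU.Runtime.CPU;

class Program
{
    static void Main()
    {
        // Assumption: ILGPU 0.x, where a Context can be created directly.
        using (var context = new Context())
        // Assumption: the second argument is the number of CPU worker threads.
        // Environment.ProcessorCount is just an illustrative choice.
        using (var accelerator = new CPUAccelerator(context, Environment.ProcessorCount))
        {
            // Print the accelerator name to confirm it was created.
            Console.WriteLine(accelerator.Name);
        }
    }
}
```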

m4rs-mt commented 4 years ago

@toolwtech I already had a brief discussion about enhanced CPU-debugging capabilities with @MoFtZ. We think it makes sense to add extended CPUAccelerator features in one of the next versions.

However, I am not 100% sure whether I understand your problem correctly. As @Yey007 suggested, you can specify the number of CPU accelerator threads explicitly. Note that this does not change the fact that the current RadixSort implementation is processed single-threaded on the CPU accelerator.

m4rs-mt commented 3 years ago

@GeoBIM2020 We are currently working on a new CPU accelerator that will allow programmers to mimic the warp/group setups of arbitrary GPUs. Stay tuned 🚀