m4rs-mt / ILGPU

ILGPU JIT Compiler for high-performance .Net GPU programs
http://www.ilgpu.net

VelocityDevice and MaxGridSize #1115

Closed by m0bygit 1 year ago

m0bygit commented 1 year ago

Hi, there might be a problem with the MaxGridSize of the Velocity devices. Currently it is initialized as:

MaxGridSize = new Index3D(int.MaxValue, 1, 1); // VelocityDevice.cs line 81

Is that really correct? It throws an ArgumentOutOfRangeException when using an ArrayView2D (since the Y dimension of the array is most likely larger than 1). :)

I guess I can rework the kernel to use an ArrayView1D instead, but I'm guessing this is a bug? Thank you for a really great library! The latest addition of the Velocity devices made my program on M1 Max a lot faster.

m4rs-mt commented 1 year ago

@m0bygit Thank you for your question and for testing the pre-alpha version of our Velocity accelerator. Velocity currently supports 1D kernels only. As we continue to considerably improve performance by adding Arm64 (AdvSimd)-compatible and AVX/AVX2-compatible instruction generation in the next couple of weeks, we may also reconsider adding multi-dimensional support to the initial Velocity release. However, this requires remapping of multi-dimensional kernels to 1D linear kernels and further reconstruction steps inside the new accelerator.
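To illustrate the kind of remapping mentioned above, here is a minimal sketch in plain C# of running a 2D computation over a 1D index range and reconstructing the 2D coordinate inside the kernel body. It is not ILGPU code and not how Velocity does (or will do) it. The names `LinearIndexing`, `ToLinear`, `To2D`, `Kernel` and `Run` are hypothetical, the sequential loop only stands in for a 1D kernel launch, and the layout (X varying fastest) is an assumption.

```csharp
using System;

public static class LinearIndexing
{
    // Maps a 2D coordinate (x, y) to a linear index.
    // Assumption: X varies fastest (row-major style layout).
    public static int ToLinear(int x, int y, int width) => y * width + x;

    // Reconstructs the 2D coordinate from a linear index.
    public static (int X, int Y) To2D(int linear, int width) =>
        (linear % width, linear / width);

    // Hypothetical stand-in for a 1D kernel body: invoked once per linear index.
    public static void Kernel(int index, int[] data, int width)
    {
        var (x, y) = To2D(index, width);
        data[index] = x + 10 * y; // any per-element 2D computation
    }

    // Stand-in for a 1D launch over width * height elements.
    public static int[] Run(int width, int height)
    {
        var data = new int[width * height];
        for (int i = 0; i < data.Length; ++i)
            Kernel(i, data, width);
        return data;
    }
}
```

With this pattern the launch extent is the single value width * height, so only the X dimension of the grid is used. The width has to be passed to the kernel explicitly, because a flat view no longer carries the 2D extent.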

m0bygit commented 1 year ago

@m4rs-mt Thank you for the clarification. I am now using a 1D kernel instead. Looking forward to even more pre-alpha versions and performance improvements. :) Have a great day and thanks again for this great library.

m4rs-mt commented 1 year ago

@m0bygit Thank you so much for your feedback!