Hardware-accelerated vector compute library for .NET, containing quality-of-life improvements and functionality intended for data science, graphical processing, and GPGPU.
I moved all the kernels into the GPU class and precompiled them right after we get the accelerator. This moves around 400 ms of latency away from the first kernel operation and into construction of the GPU object.
I also changed a few memory allocations to reduce garbage generation.
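
To illustrate the precompilation change, here is a minimal sketch of the pattern in plain C#. It is not the library's actual code: `EagerKernelCache`, `Compile`, `Run`, and the kernel names are hypothetical stand-ins, and the "kernels" are ordinary delegates rather than real GPU kernels. The only point is that every kernel is built in the constructor, so the first call pays no compilation cost.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the GPU class; all names here are illustrative.
public sealed class EagerKernelCache
{
    private readonly Dictionary<string, Func<float[], float[]>> _kernels =
        new Dictionary<string, Func<float[], float[]>>();

    // Counts how many times the (simulated) compile step has run.
    public int CompileCount { get; private set; }

    public EagerKernelCache()
    {
        // Pay the compilation cost once, up front, instead of on first use.
        _kernels["negate"] = Compile(x => -x);
        _kernels["square"] = Compile(x => x * x);
    }

    // Stand-in for an expensive kernel compilation step (assumption:
    // in a real GPU library this would be where the kernel is loaded).
    private Func<float[], float[]> Compile(Func<float, float> op)
    {
        CompileCount++;
        return input =>
        {
            var output = new float[input.Length];
            for (int i = 0; i < input.Length; i++)
            {
                output[i] = op(input[i]);
            }
            return output;
        };
    }

    // Running a kernel is only a dictionary lookup plus the call itself.
    public float[] Run(string kernel, float[] input)
    {
        return _kernels[kernel](input);
    }
}
```

With this layout, `new EagerKernelCache()` performs both compile steps, and later calls such as `cache.Run("square", new[] { 1f, 2f, 3f })` reuse the stored delegates. The trade-off is the same as in the change described above: construction gets slower so that the first operation does not.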