sinshu / numflat

A numerical computation library for C#
MIT License

Can we achieve hardware acceleration like Vector3 in System.Numerics? #30

Open a1821216780 opened 1 month ago

a1821216780 commented 1 month ago

Can we achieve hardware acceleration like Vector3 in System.Numerics?

sinshu commented 1 month ago

I investigated whether I could accelerate matrix-related calculations by using the experimental TensorPrimitives with generic math support, introduced in .NET 9. After benchmarking some matrix decomposition tasks, it seems that hardware acceleration is particularly effective for larger matrices. You can see the results here.
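For reference, a minimal sketch of what the generic TensorPrimitives API looks like is below, assuming the System.Numerics.Tensors 9.x package is referenced (depending on the package version, parts of the generic-math surface may still be marked experimental, as noted above):

```csharp
// Sketch only: element-wise operations via the generic TensorPrimitives API.
// Assumes the System.Numerics.Tensors package is referenced.
using System;
using System.Numerics.Tensors;

class TensorPrimitivesExample
{
    static void Main()
    {
        double[] x = { 1.0, 2.0, 3.0, 4.0 };
        double[] y = { 5.0, 6.0, 7.0, 8.0 };
        var destination = new double[4];

        // Element-wise addition over spans; the implementation uses SIMD
        // where the hardware supports it.
        TensorPrimitives.Add<double>(x, y, destination);

        // Dot product, also vectorized where possible.
        double dot = TensorPrimitives.Dot<double>(x, y);

        Console.WriteLine(string.Join(", ", destination)); // 6, 8, 10, 12
        Console.WriteLine(dot);                            // 70
    }
}
```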

However, as of now, the generic math support for TensorPrimitives is provided as an experimental package, and the installation process is rather complex. Therefore, I think it will be quite some time before hardware acceleration is added to NumFlat itself.

If you want hardware acceleration right away, one option is to call OpenBLAS directly on NumFlat's matrix data, as explained here, so please consider that approach.
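As a rough illustration of that idea (not the exact code from the linked explanation), something along these lines should work. The property names (RowCount, ColCount, Stride, Memory) and the column-major layout are assumptions based on NumFlat's documentation, and the cblas_dgemm binding is a hypothetical hand-written P/Invoke, not part of NumFlat:

```csharp
// Sketch only: passing NumFlat's matrix storage to OpenBLAS via P/Invoke.
using System.Runtime.InteropServices;
using NumFlat;

static class OpenBlas
{
    // Hypothetical binding; the native library name depends on the platform.
    [DllImport("libopenblas", EntryPoint = "cblas_dgemm")]
    public static extern unsafe void Dgemm(
        int order, int transA, int transB,
        int m, int n, int k,
        double alpha, double* a, int lda,
        double* b, int ldb,
        double beta, double* c, int ldc);
}

static class Example
{
    const int ColMajor = 102; // CblasColMajor
    const int NoTrans = 111;  // CblasNoTrans

    // Computes c = a * b using OpenBLAS on NumFlat's column-major buffers.
    public static unsafe void Multiply(Mat<double> a, Mat<double> b, Mat<double> c)
    {
        fixed (double* pa = a.Memory.Span)
        fixed (double* pb = b.Memory.Span)
        fixed (double* pc = c.Memory.Span)
        {
            OpenBlas.Dgemm(
                ColMajor, NoTrans, NoTrans,
                a.RowCount, b.ColCount, a.ColCount,
                1.0, pa, a.Stride,
                pb, b.Stride,
                0.0, pc, c.Stride);
        }
    }
}
```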

a1821216780 commented 1 month ago

I have a question. Math.NET can use MKL, OpenBLAS, and other native accelerators. In NumFlat's benchmarks, is the comparison against Math.NET with such an accelerator, or against Math.NET without one? Would you consider adding a comparison against Math.NET with a native accelerator enabled? There may be interop overhead when an accelerator is used.

sinshu commented 1 month ago

If your question is whether the backend of Math.NET used in this benchmark utilizes hardware acceleration like MKL, the answer is no. This benchmark uses the managed backend of Math.NET.
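For clarity, this is roughly how the provider is selected in Math.NET; the benchmark above uses the default managed provider and makes none of the native-provider calls. The sketch assumes one of the native provider packages (e.g. the MKL provider plus its native runtime package) is installed:

```csharp
// Sketch only: switching Math.NET's linear algebra provider.
using MathNet.Numerics;
using MathNet.Numerics.LinearAlgebra;

class ProviderExample
{
    static void Main()
    {
        // Default: purely managed implementation (what the benchmark uses).
        Control.UseManaged();

        var a = Matrix<double>.Build.Random(500, 500);
        var b = Matrix<double>.Build.Random(500, 500);
        var managedProduct = a * b;

        // Opt in to a native backend; this is where the interop overhead
        // discussed below would come into play.
        Control.UseNativeMKL();
        var mklProduct = a * b;
    }
}
```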

As for the interop overhead when using MKL or OpenBLAS as the backend for Math.NET, based on my observation of how such providers are implemented, I believe the overhead is minimal. It seems to involve only a few argument checks or copies during backend processing. Especially with larger matrices, the overhead should be almost negligible.

Of course, to be certain, proper benchmarking is required. However, investigating the interop overhead of Math.NET is not essential for the development of NumFlat, so I’d like to consider it out of scope here...
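If someone does want to measure that overhead themselves, a minimal BenchmarkDotNet sketch along these lines could compare the managed and native providers on the same workload. This is purely an illustration, not part of NumFlat's benchmark suite, and it assumes BenchmarkDotNet and a native Math.NET provider package are installed:

```csharp
// Sketch only: comparing Math.NET's managed and MKL providers.
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using MathNet.Numerics;
using MathNet.Numerics.LinearAlgebra;

public class ProviderBenchmark
{
    private Matrix<double> a = null!;
    private Matrix<double> b = null!;

    [Params(50, 500)]
    public int Size;

    [Params(false, true)]
    public bool UseNative;

    [GlobalSetup]
    public void Setup()
    {
        // Select the provider once per parameter combination, outside the
        // measured code, so only the multiplication itself is timed.
        if (UseNative)
            Control.UseNativeMKL();
        else
            Control.UseManaged();

        a = Matrix<double>.Build.Random(Size, Size);
        b = Matrix<double>.Build.Random(Size, Size);
    }

    [Benchmark]
    public Matrix<double> Multiply() => a * b;
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<ProviderBenchmark>();
}
```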