TornadoVM can assign larger thread blocks depending on the driver. This can improve performance on some platforms (e.g., NVIDIA GH200) with the NVIDIA driver 545.X.
Problem description
N/A
Backend/s tested
Mark the backends affected by this PR.
[X] OpenCL
[ ] PTX
[ ] SPIRV
OS tested
Mark the OS where this PR is tested.
[X] Linux
[ ] macOS
[ ] Windows
Did you check on FPGAs?
If it is applicable, check your changes on FPGAs.
[ ] Yes
[X] No
How to test the new patch?
make
make tests
tornado --threadInfo -m tornado.examples/uk.ac.manchester.tornado.examples.compute.MatrixMultiplication2D