Open chopikus opened 1 month ago
Hi @chopikus . Thank you for the detailed report.
I can also reproduce the error on NVIDIA GPUs. The problem is in clBuildProgram, once the code is generated. However, running on Intel GPUs with OpenCL works. It also works with the SPIR-V and PTX backends. We will take a look and analyze why NVIDIA and AMD are reporting a clBuildProgram failure.
For reference, I added this test in this branch: https://github.com/beehive-lab/TornadoVM/commit/87e6080ad9c2e23ecb3b19479275b51c0ae61738
In general, the most tested platforms for TornadoVM are NVIDIA discrete GPUs and Intel GPUs (ARC and HD Graphics). The most supported backend is OpenCL. However, we are pushing for SPIR-V, and we hope it will become the default backend in the future.
Regarding the FPGAs, we have tested on Intel Altera FPGAs and AMD/Xilinx FPGAs. My colleague Thanos can give you more details about the current support.
On Fri, 26 Jul 2024 at 15:21, Igor Chovpan @.***> wrote:
Thank you for a quick response!
@jjfumero Which platform is the most stable for TornadoVM? Does this example work on FPGAs or on some hosting?
I really like this project and want to try it out more.
Good day.
I hope you excuse me if I add my 5 cents by asking.
Recently, AMD released a new NPU, which will be supported by Xilinx RT and, in turn, will work over OpenCL. If I understood correctly, oneAPI supports OpenCL too, so won't making SPIR-V the default narrow down the usage possibilities of TornadoVM?
Hi @andrii0lomakin, OpenCL >= 2.1 can dispatch SPIR-V binary kernels. In fact, TornadoVM currently dispatches SPIR-V with both the OpenCL runtime and the Level Zero API from oneAPI. We hope vendors will use SPIR-V more in the future. From my view, the way to go is SPIR-V and PTX. However, debugging the compiler gets increasingly complex.
Given the current landscape of vendors and accelerators, it is difficult to deprecate our OpenCL C backend. Just a few examples: FPGA vendors support OpenCL 1.0 - 1.2. Apple supports OpenCL 1.2. Thus, if TornadoVM is to run on those platforms as well, the OpenCL C backend is still needed. Unless, of course, there are new backends (e.g., for VHDL directly, Apple Metal, etc.).
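To make the version argument above concrete, here is a minimal Java sketch of the ">= 2.1" rule. It assumes the device version string follows the format the OpenCL specification defines for CL_DEVICE_VERSION ("OpenCL <major>.<minor> <vendor-specific information>"). The class and method names are hypothetical and are not part of TornadoVM; the version strings in the usage note are made-up examples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a TornadoVM API.
final class OpenCLVersionCheck {

    // CL_DEVICE_VERSION has the form "OpenCL <major>.<minor> <vendor-specific information>".
    private static final Pattern VERSION = Pattern.compile("^OpenCL (\\d+)\\.(\\d+)");

    // Returns true when the reported version is 2.1 or newer, i.e. the
    // runtime may be able to consume SPIR-V binaries. This is a first
    // filter only: the device still has to be queried for actual support.
    static boolean mayDispatchSpirv(String deviceVersion) {
        Matcher m = VERSION.matcher(deviceVersion);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized version string: " + deviceVersion);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        return major > 2 || (major == 2 && minor >= 1);
    }
}
```

With this sketch, a string such as "OpenCL 1.2 " (the level mentioned above for Apple and the FPGA vendors) yields false, so the OpenCL C path would still be required there, while "OpenCL 2.1 " yields true. Note that in OpenCL 3.0 support for intermediate-language programs is optional, so a version check alone is not sufficient and a real implementation would also query the device.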
Describe the bug
Running the example program produces an error for big enough arrays.
Program:
Running mvn package and tornado -jar [JARFILE] produces an error:
Memory access fault by GPU node-1 (Agent handle: 0x7fe80076f230) on address 0x7fe614c00000. Reason: Page not present or supervisor privilege.
However, if I change the size of array to 1023*8 instead of 1024*8, the error is gone.
How To Reproduce
I put my code into a repository: https://github.com/chopikus/my-tornado-app.
Steps:
git clone https://github.com/chopikus/my-tornado-app.git
cd my-tornado-app
./run.sh
Expected behavior
No errors should be produced
Computing system setup (please complete the following information):
tornado --version: version=1.0.7-dev, branch=master, commit=96b3040; Backends installed: opencl
tornado -version: java version "21.0.4" 2024-07-16 LTS; Java(TM) SE Runtime Environment (build 21.0.4+8-LTS-274); Java HotSpot(TM) 64-Bit Server VM (build 21.0.4+8-LTS-274, mixed mode)
Additional context
tornado --devices: