beehive-lab / TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
https://www.tornadovm.org
Apache License 2.0
1.17k stars 110 forks source link

Add Windows support to automatic installation script #371

Closed otabuzzman closed 4 months ago

otabuzzman commented 4 months ago

Description

The PR adds Windows support to the bin/tornadovm-installer automatic installation script. The documentation (readthedocs) has been updated accordingly.

Problem description

n/ a.

Backend/s tested

Mark the backends affected by this PR.

OS tested

Mark the OS where this PR is tested.

See results summary of test suite below.

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

How to test the new patch?

On a Windows box:


Results summary (failures) of TornadoVM test suite.

Note: The suite was run twice consecutively on each configuration, with TornadoVM previously compiled for a single specific backend.

Windows 11 Home on Lenovo IdeaPad

OpenCL backend, 1st run

  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

OpenCL backend, 2nd run

  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

PTX backend, 1st run

  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

PTX backend, 2nd run

  Running test: testCopyInWithDevice       ................  [FAILED]
          \_[REASON] expected:<452661.0> but was:<513877.0>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

SPIR-V backend, 1st run

        Running test: testReduction01            ................  [FAILED]
                \_[REASON] expected:<-966107475> but was:<-464861690>
        Running test: testIrregularSize03        ................  [FAILED]
                \_[REASON] expected:<336.0173> but was:<337.83768>
        Running test: testCopyInWithDevice       ................  [FAILED]
                \_[REASON] expected:<114088.0> but was:<100620.0>
        Running test: testBatchNotEven           ................  [FAILED]
                \_[REASON] expected:<5000000.0> but was:<0.0>
        Running test: test50MBInteger            ................  [FAILED]
                \_[REASON] Unable to allocate 50000024 bytes of memory.
        Running test: test04                     ................  [FAILED]
                \_[REASON] Unable to compile task s0.t0 - badCascadeKernel4

SPIR-V backend, 2nd run

        Running test: testReduction01            ................  [FAILED]
                \_[REASON] expected:<1533689219> but was:<1584861366>
        Running test: testIrregularSize03        ................  [FAILED]
                \_[REASON] expected:<424.82053> but was:<427.6317>
        Running test: testCopyInWithDevice       ................  [FAILED]
                \_[REASON] expected:<108992.0> but was:<95368.0>
        Running test: testBatchNotEven           ................  [FAILED]
                \_[REASON] expected:<5000000.0> but was:<0.0>
        Running test: test04                     ................  [FAILED]
                \_[REASON] Unable to compile task s0.t0 - badCascadeKernel4

Amazon Linux on EC2 (g4dn.xlarge)

OpenCL backend, 1st run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<6.576169967651367>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

OpenCL backend, 2nd run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<6.476556777954102>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

PTX backend, 1st run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<6.341561794281006>
  Running test: testCopyInWithDevice       ................  [FAILED]
          \_[REASON] expected:<163063.0> but was:<138806.0>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

PTX backend, 2nd run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<7.097877502441406>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

Windows Server 2022 on EC2 (g4dn.xlarge)

Note: Missing SSL certificates required manual Maven download.

OpenCL backend, 1st run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<3.9345362186431885>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

OpenCL backend, 2nd run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<5.727963447570801>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>

PTX backend, 1st run

  Running test: testCopyInWithDevice       ................  [FAILED]
          \_[REASON] expected:<155511.0> but was:<132241.0>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

PTX backend, 2nd run

  Running test: testComputePi              ................  [FAILED]
          \_[REASON] expected:<3.14> but was:<4.928632736206055>
  Running test: testCopyInWithDevice       ................  [FAILED]
          \_[REASON] expected:<152087.0> but was:<178420.0>
  Running test: testBatchNotEven           ................  [FAILED]
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: testTornadoMathSinPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.
  Running test: testTornadoMathCosPIDouble ................  [FAILED]
          \_[REASON] Bailout is disabled.

macOS Sonoma on Macbook Air (2015)

Note: Running the test suite required to remove enable assertions option in test target in Makefile. Probably due to Sonoma compatibility issues on unsupported MacBook Air (2015). Assert in question is in .../uk/ac/manchester/tornado/runtime/interpreter/TornadoVMInterpreter.java (device != null).

OpenCL backend, 1st run

  Running test: test06                     ................  [FAILED] 
          \_[REASON] Bailout is disabled. 
  Running test: testBatchNotEven           ................  [FAILED] 
          \_[REASON] expected:<5000000.0> but was:<0.0>
  Running test: test512MB                  ................  [FAILED] 
          \_[REASON] Unable to allocate 512000024 bytes of memory.

OpenCL backend, 2nd run

  Running test: test06                     ................  [FAILED] 
          \_[REASON] Bailout is disabled. 
  Running test: test512MB                  ................  [FAILED] 
          \_[REASON] Unable to allocate 512000024 bytes of memory.
  Running test: testBatchNotEven           ................  [FAILED] 
          \_[REASON] expected:<5000000.0> but was:<0.0>
CLAassistant commented 4 months ago

CLA assistant check
All committers have signed the CLA.

jjfumero commented 4 months ago

Thank you @otabuzzman for this patch. We will take a look in the following days. In the meantime, it seems some of the commits were done using another account in which the CLA assistant detects that it has not been signed.

Regarding the errors in the unit-tests. some of them are expected and it depends on the driver user. But we will take a closer look as well.

otabuzzman commented 4 months ago

I issued the commit in question from my Macbook. There is another Git configuration that incorrectly uses my real name instead of otabuzzman. I will fix the problem when I have access to my MacBook on April 13th. However, fixing the configuration will not change the commit in this PR. Should I create a new one and you reject this PR?

Am Sa., 6. Apr. 2024 um 07:34 Uhr schrieb Juan Fumero < @.***>:

Thank you @otabuzzman https://github.com/otabuzzman for this patch. We will take a look in the following days. In the meantime, it seems some of the commits were done using another account in which the CLA assistant detects that it has not been signed.

Regarding the errors in the unit-tests. some of them are expected and it depends on the driver user. But we will take a closer look as well.

— Reply to this email directly, view it on GitHub https://github.com/beehive-lab/TornadoVM/pull/371#issuecomment-2040975947, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD7PMXEOSQ5HTABYBNNE7GDY36COFAVCNFSM6AAAAABFZVBGE6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBQHE3TKOJUG4 . You are receiving this because you were mentioned.Message ID: @.***>

jjfumero commented 4 months ago

Hi Jürgen. I understand. Unfortunately, we can't merge if the CLA is not signed for all commits. So my take here, as you mentioned, is to open a new PR with the diff using the account/email that you have already used for the CLA for the previous merges. Sorry for the inconvenience.

From our side, half of the team will be in a conference from mid this week, so we could start review next week too.

otabuzzman commented 4 months ago

No problem. I‘ll fix the commit and open a new PR. Have fun on your conference. Regards, Jürgen.

Juan Fumero @.***> schrieb am Mo. 8. Apr. 2024 um 08:57:

Hi Jürgen. I understand. Unfortunately, we can't merge if the CLA is not signed for all commits. So my take here, as you mentioned, is to open a new PR with the diff using the account/email that you have already used for the CLA for the previous merges. Sorry for the inconvenience.

From our side, half of the team will be in a conference from mid this week, so we could start review next week too.

— Reply to this email directly, view it on GitHub https://github.com/beehive-lab/TornadoVM/pull/371#issuecomment-2041994538, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD7PMXBIYB4JWHRWMKJM5VLY4I5UDAVCNFSM6AAAAABFZVBGE6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBRHE4TINJTHA . You are receiving this because you were mentioned.Message ID: @.***>

jjfumero commented 4 months ago

I have tested on Windows 11 with the OpenCL backend and it works perfectly. Thank you, this is great work.

The tests that fail, many of them are expected, especially those related to SPIR-V and PTX. Thank you for the report too. On Linux, many of these are passing.

jjfumero commented 4 months ago

I got an error when running the native test:

==================================================
              Unit tests report
==================================================

{'[PASS]': 610, '[FAILED]': 10, '[UNSUPPORTED]': 45}
Coverage [PASS/(PASS+FAIL)]: 98.39%
Coverage [PASS/(PASS+FAIL+UNSUPPORTED)]: 91.73%

==================================================

Total Time(s): 179.11767625808716

NMAKE : fatal error U1077: 'python %TORNADO_SDK%\bin\tornado-test --ea --verbose' : return code '0x1'
Stop.Stop.
jjfumero commented 4 months ago

Linux and OSx also work.

otabuzzman commented 4 months ago

I got an error when running the native test:

==================================================
              Unit tests report
==================================================

{'[PASS]': 610, '[FAILED]': 10, '[UNSUPPORTED]': 45}
Coverage [PASS/(PASS+FAIL)]: 98.39%
Coverage [PASS/(PASS+FAIL+UNSUPPORTED)]: 91.73%

==================================================

Total Time(s): 179.11767625808716

NMAKE : fatal error U1077: 'python %TORNADO_SDK%\bin\tornado-test --ea --verbose' : return code '0x1'
Stop.Stop.

I think this is expected behavior: The script tornado-test returns 0x1 if any error occurs that is not whitelisted. The nmake program considers return values != 0 errors. I therefore guess that at least one of the 10 tests that failed in the above report is not whitelisted.

The same goes for make. I checked it on Linux with a PTX backend that has not whitelisted errors and the behavior is the same:

==================================================
              Unit tests report
==================================================

{'[PASS]': 566, '[FAILED]': 9, '[UNSUPPORTED]': 56}
Coverage [PASS/(PASS+FAIL)]: 98.43%
Coverage [PASS/(PASS+FAIL+UNSUPPORTED)]: 89.7%

==================================================

Total Time(s): 254.25732254981995

make: *** [tests] Error 1
[ec2-user@ip-172-31-18-17 TornadoVM]$
otabuzzman commented 4 months ago

LGMT. I tested it in Windows, mac and Linux. The only comment is if we deprecate the .\bin\tornadovm-installer.cmd, to also update the installation.rst file.

I think it would be ok to deprecate .\bin\tornadovm-installer.cmd. The original automatic installer now works (almost) the same on all supported operating systems. The respective paragraph in installation.rst could therefore be removed.

Question: Have you tried also the MSYS2 workflow?

Good and obvious question. I didn't think of it and thus didn't test it with MSys2. Should I? The documentation does not foresee running bin\tornadovm-installer in the MSYS2 workflow.

stratika commented 4 months ago

LGMT. I tested it in Windows, mac and Linux. The only comment is if we deprecate the .\bin\tornadovm-installer.cmd, to also update the installation.rst file.

I think it would be ok to deprecate .\bin\tornadovm-installer.cmd. The original automatic installer now works (almost) the same on all supported operating systems. The respective paragraph in installation.rst could therefore be removed.

Yes, I feel the same. Let's remove it to keep it simple. Can you please include this change in the PR?

Question: Have you tried also the MSYS2 workflow?

Good and obvious question. I didn't think of it and thus didn't test it with MSys2. Should I? The documentation does not foresee running bin\tornadovm-installer in the MSYS2 workflow.

That's fine. I spoke with @jjfumero and we believe that MSYS2 installation and working flow should be deprecated, since we have a native workflow working.

Thank you very much @otabuzzman, once the script is removed we can proceed to merge it.

otabuzzman commented 4 months ago

... (text removed) Yes, I feel the same. Let's remove it to keep it simple. Can you please include this change in the PR?

Sure. Done.

... (text removed) That's fine. I spoke with @jjfumero and we believe that MSYS2 installation and working flow should be deprecated, since we have a native workflow working.

I think that the section on manual Windows installation is worth keeping in an adapted version, and provided a proposal (my last commits). I removed the section on Windows issues as well since they were all related to MSYS2.