adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

System unavailable: build-alibaba-win2012r2-x64-[12] #1818

Closed sxa closed 3 years ago

sxa commented 3 years ago

This will prevent the Alibaba Windows builds from working, as they are currently tied to these machines.

Haroon-Khel commented 3 years ago

I've created a symlink in cygwin/bin/ which links to Program Files\CMake\bin. Re-running at https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/911/console

Haroon-Khel commented 3 years ago

I think it may have found cmake, but it is hitting a different error now:

13:45:33  CMake Error: Error processing file: /cygdrive/c/Jenkins/temp/workspace/build/src/openj9/runtime/cmake/caches/win_x86-64_cmprssptrs.cmake
13:45:33  CMake Error: The source directory "/cygdrive/c/Jenkins/temp/workspace/build/src/openj9" does not exist.
13:45:33  Specify --help for usage, or press the help button on the CMake GUI.
13:45:33  loading initial cache file /cygdrive/c/Jenkins/temp/workspace/build/src/openj9/runtime/cmake/caches/win_x86-64_cmprssptrs.cmake
13:45:33  make[3]: *** [/cygdrive/c/Jenkins/temp/workspace/build/src/closed/OpenJ9.gmk:414:

sxa commented 3 years ago

The "12:25:49 /usr/bin/bash: /cygdrive/c/Program: No such file or directory" error suggests that it's having trouble with the spaces in Windows directory names.

The question is: What's different between how it's being picked up on the alibaba machines vs the Azure/IBMCloud ones?
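
To illustrate the word splitting behind that bash error, here is a minimal Python sketch. It uses shlex to mimic how a POSIX shell splits an unquoted command line. The command string is an assumption made up for illustration; it is not taken from the build scripts.

```python
import shlex

# Hypothetical command path; the real build invocation may differ.
CMAKE = "/cygdrive/c/Program Files/CMake/bin/cmake"


def first_word(command_line: str) -> str:
    """Return the word a POSIX shell would treat as the program to run."""
    return shlex.split(command_line)[0]


unquoted = f"{CMAKE} --version"
quoted = f"{shlex.quote(CMAKE)} --version"

print(first_word(unquoted))  # the path is cut at the space
print(first_word(quoted))    # the full path survives
```

With the unquoted form, the program name ends at "/cygdrive/c/Program", which matches the path in the error message. A symlink in a directory without spaces sidesteps the problem without touching the scripts.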

Haroon-Khel commented 3 years ago

On build-azure-win2012r2-x64-1, cmake is in cygwin64/bin/cmake.exe. But I think the symlink I created managed to solve the spaced directory problem. We're now getting a CMake error, which I have posted above. The azure machine's cmake is version 3.14.5, while the alibaba's is 3.7.2, the version that ansible installs.

sxa commented 3 years ago

On build-azure-win2012r2-x64-1, cmake is in cygwin64/bin/cmake.exe. But I think the symlink I created managed to solve the spaced directory problem. We're now getting a CMake error, which I have posted above.

So where did the cmake.exe one come from on the machines that work - is it a cygwin one as opposed to one from a separate cmake installation?

sxa commented 3 years ago

OK, this is rather odd... The cmake role seems to check for cmake's presence at C:\cygwin64\bin\cmake.exe and then uses that to decide whether to install the one from cmake.org, which won't update C:\cygwin64\bin\cmake.exe. What a mess...

@AdamBrousseau Do the OpenJ9 Windows machines use cmake from cygwin or from cmake.org? If the former (and assuming Haroon confirms how the other machines are set up - cmake --version will hopefully help figure it out) then we should possibly standardise on that version and remove the cmake role.

Haroon-Khel commented 3 years ago

So where did the cmake.exe one come from on the machines that work - is it a cygwin one as opposed to one from a separate cmake installation?

I'm not too sure, but I would assume that it came with cygwin since it's in the cygwin directory. When I had to manually use the installer to install cmake on alibaba, its default installation directory was Program Files\CMake. Therefore I assume the way our cmake role works is that it checks in cygwin64\bin\ for a prepackaged cmake, else it installs it separately in Program Files.

Haroon-Khel commented 3 years ago

Yeah, until now I assumed that the cmake role was responsible for installing cmake in cygwin64\bin\cmake. I guess not.

Haroon-Khel commented 3 years ago

I'm going to install cmake on alibaba-2 and see if I get the same error.

sxa commented 3 years ago

I assume the way our cmake role works is that it checks in cygwin64\bin\ for a prepackaged cmake, else it installs it separately in Program Files

While that looks like what it's doing, it doesn't seem particularly sensible, since it'll repeatedly try to install the separate one whenever there isn't a copy in C:\cygwin64\bin. It shouldn't be trying to reinstall the separate cmake if it's already there under C:\Program Files.
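
The check described above could be made idempotent by looking in every location where cmake might already live. A minimal Python sketch of that decision follows; it models the logic only and is not the role's actual Ansible code. The two candidate paths are assumptions based on the directories mentioned in this thread, and the exists callable is a hypothetical stand-in for a real file check.

```python
from pathlib import PureWindowsPath
from typing import Callable, Iterable

# Candidate locations are assumptions based on the paths mentioned in this thread.
CANDIDATES = (
    PureWindowsPath(r"C:\cygwin64\bin\cmake.exe"),
    PureWindowsPath(r"C:\Program Files\CMake\bin\cmake.exe"),
)


def needs_cmake_install(
    exists: Callable[[PureWindowsPath], bool],
    candidates: Iterable[PureWindowsPath] = CANDIDATES,
) -> bool:
    """Return True only when cmake.exe is absent from every known location."""
    return not any(exists(path) for path in candidates)


# Simulate a machine where only the cmake.org install is present.
present = {PureWindowsPath(r"C:\Program Files\CMake\bin\cmake.exe")}
print(needs_cmake_install(lambda p: p in present))  # False
```

Checking only the cygwin path would return True for that simulated machine on every run, which is the repeated install being described.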

Haroon-Khel commented 3 years ago

I just ran the cmake role on -2. It didn't install it in either Program Files or cygwin\bin, despite reporting changed. That's confusing.

TASK [Download cmake installer] *********************************************************************************************************************************************************************************************************************************************************
task path: /Users/hkhel/AdoptOpenJDK/openjdk-infrastructure/ansible/playbooks/AdoptOpenJDK_Windows_Playbook/roles/cmake/tasks/main.yml:12
changed: [8.208.87.18] => {"changed": true, "checksum_dest": "8b0cbfc6be83e31a058c8ef282fe204862809ffcd8788bc19a8f0eb457f71187", "checksum_src": "8b0cbfc6be83e31a058c8ef282fe204862809ffcd8788bc19a8f0eb457f71187", "dest": "C:\\temp\\cmake.msi", "elapsed": 3.1962816, "msg": "OK", "size": 18235900, "status_code": 200, "url": "https://cmake.org/files/v3.7/cmake-3.7.2-win64-x64.msi"}

TASK [Install cmake] ********************************************************************************************************************************************************************************************************************************************************************
task path: /Users/hkhel/AdoptOpenJDK/openjdk-infrastructure/ansible/playbooks/AdoptOpenJDK_Windows_Playbook/roles/cmake/tasks/main.yml:22
changed: [8.208.87.18] => {"changed": true, "rc": 0, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

sxa commented 3 years ago

You'll need to go onto the machine and try installing it manually using the command in the playbook to see what happens, and/or search the whole machine for cmake.exe to see if the playbook has put it anywhere. But the point I was making before is whether it's even used at all on the other machines, or whether they are actually using cmake from cygwin for the OpenJ9 builds.
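
One way to search a whole drive for cmake.exe is a recursive walk. The sketch below is a generic Python illustration, not something taken from the playbooks; the search root C:\ is an assumption.

```python
import os
from typing import Iterator


def find_files(root: str, filename: str = "cmake.exe") -> Iterator[str]:
    """Yield full paths of files under root whose name matches, ignoring case."""
    target = filename.lower()
    # os.walk ignores directories it cannot read unless onerror is supplied.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == target:
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # "C:\\" is an assumed search root; narrow it to speed things up.
    for hit in find_files("C:\\"):
        print(hit)
```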

Haroon-Khel commented 3 years ago

Both build-azure machines use 3.14.5, while both build-ibmcloud machines use 3.17.3. Both sets use cygwin's cmake.
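
When comparing versions like these across machines, it helps to compare them numerically, since a plain string comparison would rank 3.7.2 above 3.14.5. A minimal Python sketch, assuming the usual "cmake version X.Y.Z" first line printed by cmake --version; the machine labels are shorthand for the groups named in this thread.

```python
import re
from typing import Tuple


def parse_cmake_version(output: str) -> Tuple[int, ...]:
    """Extract the numeric version from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("no cmake version found in output")
    return tuple(int(part) for part in match.group(1).split("."))


# Versions reported in this thread; the sample strings mimic `cmake --version`.
versions = {
    "build-azure": parse_cmake_version("cmake version 3.14.5"),
    "build-ibmcloud": parse_cmake_version("cmake version 3.17.3"),
    "build-alibaba (msi)": parse_cmake_version("cmake version 3.7.2"),
}
oldest = min(versions, key=versions.get)
print(oldest, versions[oldest])
```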

Haroon-Khel commented 3 years ago

Before your comment, I installed cmake on -2 via the msi. The same error appeared on -2 https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/914/console

sxa commented 3 years ago

Before your comment, I installed cmake on -2 via the msi. The same error appeared on -2 https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/914/console

Is that different from what happened on the machine before you installed cmake from the MSI? If every other machine is using the cmake from cygwin I'm not sure we have a requirement for the other one

Haroon-Khel commented 3 years ago

Is that different from what happened on the machine before you installed cmake from the MSI?

Yes. Before installing it, there wasn't a cmake on the machine, so the build would give a cmake not found error. I installed it via the msi on -2 just to see if I could recreate the error. It can always be uninstalled.

In terms of next steps, the only thing I can think of is to reinstall cygwin using the playbooks to get the cygwin cmake

sxa commented 3 years ago

Yes. Before installing it, there wasn't a cmake on the machine, so the build would give a cmake not found error. I installed it via the msi on -2 just to see if I could recreate the error. It can always be uninstalled.

Hmmm - has it also added itself to the system PATH? We don't add that directory during the build scripts
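
A quick way to answer that is to test whether the CMake directory appears in the machine's PATH value. The Python sketch below compares entries case-insensitively in the Windows style; the sample PATH string is made up for illustration and is not from either machine.

```python
import ntpath


def dir_on_path(directory: str, path_value: str, sep: str = ";") -> bool:
    """Check whether directory appears in a Windows-style PATH string."""

    def norm(entry: str) -> str:
        # Lowercase, unify slashes, and drop surrounding quotes and trailing backslashes.
        return ntpath.normcase(entry.strip().strip('"')).rstrip("\\")

    wanted = norm(directory)
    return any(norm(entry) == wanted for entry in path_value.split(sep) if entry.strip())


# Sample PATH value is an assumption for illustration only.
sample = r"C:\Windows\system32;C:\Program Files\CMake\bin\;C:\cygwin64\bin"
print(dir_on_path(r"c:\program files\cmake\bin", sample))  # True
```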

sxa commented 3 years ago

In terms of next steps, the only thing I can think of is to reinstall cygwin using the playbooks to get the cygwin cmake

I don't think it needs a reinstall - from memory you can add packages by replicating the command-line parameters to the cygwin installer like https://github.com/AdoptOpenJDK/openjdk-infrastructure/blob/81d61d27006e6832c063905c80421a6ca3cd0db9/ansible/playbooks/AdoptOpenJDK_Windows_Playbook/roles/cygwin/tasks/main.yml#L20

Or worst case you just re-run the installer and select the new packages.
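
As a rough sketch of that first option, the Python snippet below builds an unattended command line for the Cygwin installer. The --quiet-mode, --root and --packages options are standard Cygwin setup flags, but the installer location and the choice of cmake as the package name are assumptions for illustration; the exact arguments in the playbook may differ, and a mirror (--site) may also be needed on a machine without a saved one.

```python
from typing import List, Sequence


def cygwin_add_packages_cmd(
    packages: Sequence[str],
    installer: str = r"C:\temp\setup-x86_64.exe",  # hypothetical installer location
    root: str = r"C:\cygwin64",
) -> List[str]:
    """Build an unattended Cygwin setup command that adds the given packages."""
    if not packages:
        raise ValueError("at least one package is required")
    return [
        installer,
        "--quiet-mode",                    # run without the interactive GUI
        "--root", root,                    # existing Cygwin installation to modify
        "--packages", ",".join(packages),  # comma-separated package names
    ]


print(cygwin_add_packages_cmd(["cmake"]))
```

The list form can be passed directly to subprocess.run on the Windows machine, which avoids any quoting issues with spaces in paths.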

Haroon-Khel commented 3 years ago

[Image: Screenshot 2021-02-04 at 15 56 14]

Using the installer, there isn't an option to install cmake as a package.

Haroon-Khel commented 3 years ago

Even the command line arguments you posted don't include cmake. I can only assume it comes with the cygwin package from the checklist I posted?

sxa commented 3 years ago

Using the installer, there isn't an option to install cmake as a package.

Hmmm, that's a bit odd. Since we have it on the other machines, it must have come from somewhere.

Even the command line arguments you posted don't include cmake. I can only assume it comes with the cygwin package from the checklist I posted?

It's possible it disappeared in the changes made to speed up the cygwin installs a few months ago, in which case it needs to be added back in. Although that should have been picked up by VagrantPlaybookCheck (NOTE: It's just about possible we haven't run a JDK11/J9 build on VPC since it was added, I suppose).

Haroon-Khel commented 3 years ago

It's possible it disappeared in the changes made to speed up the cygwin installs a few months ago, in which case it needs to be added back in. Although that should have been picked up by VagrantPlaybookCheck (NOTE: It's just about possible we haven't run a JDK11/J9 build on VPC since it was added, I suppose).

I checked for changes in the cygwin role. I don't think that cmake was ever in the arguments list.

On alibaba-1, I tried uninstalling cygwin by removing the cygwin directory. This partially worked; some files could not be deleted due to a permission error, even though I tried to delete the directory as the admin user. Nonetheless, I ran the playbook's cygwin role against the machine, which ran fine. Cygwin installed itself in the C:\cygwin64 directory, independent of the existing cygwin directory. The only problem is that this didn't come with a cmake install, so I do not know where cmake was installed from on the azure or ibmcloud machines.

Haroon-Khel commented 3 years ago

Like before, I've created a symlink to Program Files\CMake in the cygwin64\bin directory so jenkins can use it, but like before it will likely result in a CMake error.

Haroon-Khel commented 3 years ago

Update: https://github.com/AdoptOpenJDK/openjdk-infrastructure/pull/1958 allows cmake to be installed alongside cygwin. CMake is now on both alibaba machines. The openj9 jdk11 job passed on -2: https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/929/console

The same job is now running on -1 https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-windows-x64-openj9/933/console

OpenSSL has been updated from i to j on both machines too.

sxa commented 3 years ago

Build seems to have worked ok :-)

sxa commented 3 years ago

I've added the build tag back onto -1 so it should get used for tonight's builds

sxa commented 3 years ago

@Haroon-Khel Is there any outstanding work to be done here?

Haroon-Khel commented 3 years ago

This issue can be closed. The cmake issue was the last missing thing for these machines. That's now been resolved.