adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

Only 2 test windows nodes available, 8 offline #1605

Closed andrew-m-leonard closed 3 years ago

andrew-m-leonard commented 3 years ago

https://ci.adoptopenjdk.net/label/ci.role.test&&hw.arch.x86&&sw.os.windows/ Presumably a lot of these should be online?

sxa commented 3 years ago

❌ test-aws-win2019-x64-1 (Offline due to having <10GB space free)
test-aws-win2019-x64-2 (Marked offline due to https://github.com/AdoptOpenJDK/openjdk-infrastructure/issues/1602)
✓ test-azure-win2012r2-x64-1
❌ test-godaddy-win2016-x64-1
❌ test-godaddy-win2016-x64-2
❌ test-godaddy-win2016-x64-3
❌ test-godaddy-win2016-x64-4
❌ test-ibmcloud-win2012r2-x64-1 (New machine - in process by @Willsparker)
❌ test-ibmcloud-win2012r2-x64-2 (New machine - in process by @Willsparker)
❓ test-packet-win2012r2-x64-1
✓ test-softlayer-win2012r2-x64-1
❌ test-softlayer-win2012r2-x64-2

sxa commented 3 years ago

Restarted test-godaddy-win2016-x64-1, test-godaddy-win2016-x64-2 and test-godaddy-win2016-x64-4, and they are all running jobs from the queue. test-godaddy-win2016-x64-3 appears to have vanished, so I can look at recreating that one later.

sxa commented 3 years ago

test-softlayer-win2012r2-x64-2 was reporting as low on space but now has around 44GB free on the drive. I didn't clear anything up, so I'm guessing it wasn't brought back online after https://github.com/AdoptOpenJDK/openjdk-infrastructure/issues/1590 was closed. I've brought it back online and it's now running a Grinder from the queue.

sxa commented 3 years ago

AWS machine 2 is now back online after clearing up the RAM used by a lot of rogue java processes (see #1602).

sxa commented 3 years ago

Updated status:
✓ test-aws-win2019-x64-1
✓ test-aws-win2019-x64-2
✓ test-azure-win2012r2-x64-1
✓ test-godaddy-win2016-x64-1
✓ test-godaddy-win2016-x64-2
~x test-godaddy-win2016-x64-3~ System no longer exists - can be recreated if capacity needed
✓ test-godaddy-win2016-x64-4
x test-ibmcloud-win2012r2-x64-1 (New machine - not yet ready)
x test-ibmcloud-win2012r2-x64-2 (New machine - not yet ready)
x test-packet-win2012r2-x64-1
✓ test-softlayer-win2012r2-x64-1
✓ test-softlayer-win2012r2-x64-2

So we are now at 8 online and 3 offline (two of which are new machines being set up to replace another two existing ones). I'm going to close this as capacity is now adequate. Having said that, we still have disk space issues: the Windows systems are using over 80GB after the playbooks have been run, and several of the machines are configured with 100GB disks, which is causing them to go offline under the new "10GB free" requirement that we've recently implemented in Jenkins.
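To make the disk arithmetic above concrete, here is a minimal Python sketch. It is not how Jenkins implements its check. The function names `has_enough_free_space` and `headroom_gb` are hypothetical, and the only figures used are the ones quoted in this thread (100GB disks, over 80GB used after the playbooks, 10GB free required).

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def has_enough_free_space(path, min_free_gb=10):
    """Return True if the filesystem holding `path` has at least
    `min_free_gb` GiB free. Hypothetical helper, not a Jenkins API."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= min_free_gb * GIB


def headroom_gb(disk_gb, used_gb, min_free_gb=10):
    """Space (in GB) a machine can still consume before it drops
    below the free-space threshold. Hypothetical helper."""
    return disk_gb - used_gb - min_free_gb


# Figures from this thread: 100GB disk, 80GB used after the playbooks.
print(headroom_gb(100, 80))  # 10
```

Since the machines use over 80GB rather than exactly 80GB, the real headroom on a 100GB disk is under 10GB, which would explain why test workspaces push them below the threshold so quickly.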