adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

EPIC: New Machine requirement: Replacement for Equinix x64 servers #3292

Closed sxa closed 5 months ago

sxa commented 11 months ago

I need to request a new machine:

Please explain what this machine is needed for: Equinix has been sponsoring the Adoptium project by providing a generous amount of infrastructure capacity. This sponsorship is now coming to an end, and we need to make a plan for migrating our systems away from Equinix. (Note: This does not affect the aarch64 Altras, which are supplied as part of the Works On Arm project but are hosted by Equinix.)

This will involve migration of the following systems:

145.40.115.43 - VMware ESXi server (n3.large.x86 - London DC) and 147.28.133.218 - VMware ESXi server (m3.large.x86 - Paris DC)

These host our Solaris/x64 systems (including those for Temurin Compliance) as well as other Linux VMs for performance test work on x64.

London (145.40.115.xx):

Paris (147.28.133.2xx):

dockerhost-equinix-ubuntu2004-x64-1 (AMD EPYC 7401P 24 core/48 thread - c2.medium.x86 - London DC)

Used for builds and hosting many containers:

dockerhost-equinix-ubuntu2204-x64-1 (Intel Xeon Gold 40 core - n3.xlarge.x86 - London DC)

Used for builds and hosting many containers

C3AWX - (c3.small.x86 - Amsterdam DC)

Hosts our AWX instance and also the c3jenkins agent used for intermediate work on Jenkins pipelines (replacement for the "Built in Node")

Issues for individual systems:

sxa commented 10 months ago

Noting for C3awx:

sxa commented 10 months ago

I'm going to drop the number of executors on the two dockerhost x64 machines to 1 for now to see if that causes any additional delays. Some of the jobs do use multiple executors, as per the screenshot below, but those aren't time-critical jobs (unless they hold up others).

[image]

sxa commented 8 months ago

Current status:

| Old host | New host |
| --- | --- |
| build-equinix_esxi-solaris10-x64-1 | build-skytap-solaris10-x64-1 |
| test-equinix_esxi-solaris10-x64-1 | test-azure-solaris10-x64-1 |
| test-equinix_esxi-ubuntu2204-x64-1 | Not replaced |
| test-equinix_esxi-ubuntu2204-x64-1 | Not replaced |
| jck-equinix_esxi-ubuntu2204-x64-1 | In progress |
| jck-equinix_esxi_containerized-alpine317-x64-1 | To be hosted on esxi-ubuntu2204 |
| jck-equinix_esxi-solaris10-x64-1 | In progress |
| dockerhost-equinix-ubuntu2004-x64-1 | dockerhost-skytap-ubuntu2204-x64-1 |
| dockerhost-equinix-ubuntu2204-x64-1 | dockerhost-azure-ubuntu2204-x64-1 |
| C3jenkins | jenkins-worker |

Also for decommission at Equinix:

sxa commented 7 months ago

Both Equinix dockerhost x64 systems have now been shut down and the containers on them removed from Jenkins.

sxa commented 6 months ago

c3-awx and vmware-esxi7 have been shut down and removed from the inventory (left for a few days just to ensure nothing is using them).

This only leaves vmware-esxi7-2, which hosts some of the Temurin Compliance systems and can go once @fredg02 confirms that we have the replacement Solaris boxes live and working.

steelhead31 commented 5 months ago

jck-equinix-solaris-x64-1 has now been shut down and replaced.

sxa commented 5 months ago

OK, so we just have to confirm that the other two esxi jck machines are not required. I thought they'd been shut off during the last cycle to confirm that, but I'll kick off some jobs to verify.

vielmetti commented 5 months ago

Thanks to @sxa and the whole team for their hard work on this one.

sxa commented 5 months ago

vmware-esxi7-2 has been deleted from the Equinix portal. We are finally done! Closing.