adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

System unavailable: Various linux aarch64 machines #3096

Open adamfarley opened 1 year ago

adamfarley commented 1 year ago

The other two machines see the Java agent die shortly after being re-enabled, with exit code 126 (command found, but it could not be executed).
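For reference, exit status 126 is what a POSIX shell returns when it finds a command but cannot execute it, for example because the file lacks the execute bit. The sketch below reproduces that status locally. The script name `agent.sh` and the function name are hypothetical, and nothing here confirms that a permissions problem is the actual cause on these machines.

```python
import os
import shlex
import stat
import subprocess
import tempfile


def exit_code_of_non_executable_script() -> int:
    """Run a script that has no execute bit through sh and return the exit status."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for whatever command the agent launcher runs.
        path = os.path.join(tmp, "agent.sh")
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\necho started\n")
        # Read/write for the owner only: the file exists but is not executable.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        result = subprocess.run(["sh", "-c", shlex.quote(path)], capture_output=True)
        return result.returncode


# Usage: print(exit_code_of_non_executable_script())
```

A 127 status, by contrast, would mean the command was not found at all, so the two codes point at different kinds of breakage.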

One theory is that these machines are out of / low on memory. Can they be restarted please?
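The low-memory theory could be checked before restarting by reading `/proc/meminfo` on each machine. This is a minimal sketch, assuming a Linux kernel that reports `MemAvailable` (3.14 or later). The function names are illustrative, and the thread does not say what share of free memory would count as too low.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a mapping of field name to integer value."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_fraction(text: str) -> float:
    """Return MemAvailable as a fraction of MemTotal."""
    info = parse_meminfo(text)
    return info["MemAvailable"] / info["MemTotal"]


# Usage on the machine itself:
#   with open("/proc/meminfo") as handle:
#       print(available_fraction(handle.read()))
```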

If this does not fix the issue, perhaps they should be re-initialised from scratch (re-provisioned/re-ansibled).

steelhead31 commented 1 year ago

These 2 are back online following an issue whereby the dockerhost became unreachable.

test-docker-centos8-armv8-1
test-docker-debian11-armv8-1

steelhead31 commented 1 year ago

I think the host key for these 2 servers has changed, and so Jenkins can't connect to them:
test-alibaba-ubuntu1804-armv8-1
test-alibaba-ubuntu1804-armv8-2

I don't have permissions to correct this.
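One way to confirm a changed host key is to compare the key the server presents now (for example, a line of `ssh-keyscan` output) with the entry recorded in a `known_hosts` file. The sketch below does that comparison on plain text. It assumes unhashed `known_hosts` entries and skips hashed or marker lines. The function name is hypothetical, and how Jenkins stores host keys for these agents is not stated in this thread.

```python
def host_key_changed(known_hosts_text: str, host: str, scanned_line: str) -> bool:
    """Return True if `host` has a recorded key of the same type that differs
    from the key in `scanned_line` (format: "host keytype base64key")."""
    _, scanned_type, scanned_key = scanned_line.split()[:3]
    recorded = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # Skip blanks, comments, @markers, and hashed (|1|...) host entries.
        if len(fields) < 3 or fields[0].startswith(("#", "@", "|")):
            continue
        names, key_type, key = fields[0].split(","), fields[1], fields[2]
        if host in names and key_type == scanned_type:
            recorded.append(key)
    # No recorded key of this type means there is nothing to compare against.
    return bool(recorded) and scanned_key not in recorded
```

If a change is confirmed and the new key is trusted, someone with access to the controller would still need to replace the stale entry.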

steelhead31 commented 1 year ago

@Haroon-Khel do you have access to, or a contact for, these 2 Alibaba machines?

sxa commented 8 months ago

@Haroon-Khel Did you hear back from Alibaba about the machines they're hosting for us?

sxa commented 3 months ago

I've removed the Alibaba machines from Jenkins. They can be added again if we obtain replacements.