coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

[rawhide][x86_64] fcos.ignition.* kola tests time out on multiple cloud providers #1630

Closed by aaradhak 4 months ago

aaradhak commented 7 months ago

Describe the bug

The fcos.ignition.v3.noop and fcos.ignition.misc.empty kola tests fail in the latest rawhide builds with an SSH TIMEOUT on multiple cloud providers: kola-aws, kola-gcp, kola-azure, and kola-openstack. Because the failure shows up across several providers, it is presumably caused by a new piece of software introduced in rawhide rather than by any single platform.

kola test failure:

[2023-12-11T15:38:05.263Z] --- FAIL: fcos.ignition.v3.noop (1171.66s)
[2023-12-11T15:38:05.263Z]         harness.go:106: TIMEOUT[10m0s]: SSH unsuccessful within allotted timeframe for i-0c395e06c73deaa42.
[2023-12-11T15:38:05.516Z] --- FAIL: fcos.ignition.misc.empty (1172.28s)
[2023-12-11T15:38:05.516Z]         harness.go:106: TIMEOUT[10m0s]: SSH unsuccessful within allotted timeframe for i-06519bef975212419.
[2023-12-11T15:38:05.516Z] FAIL, output in /home/jenkins/agent/workspace/kola-aws/tmp/kola-VSKJl/kola/rerun
[2023-12-11T15:38:06.070Z] Error: harness: test suite failed
[2023-12-11T15:38:06.070Z] 2023-12-11T15:38:05Z cli: harness: test suite failed
[2023-12-11T15:38:06.070Z] failed to execute cmd-kola: exit status 1
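
When comparing logs like the one above across providers, it can help to pull out just the failed tests and the instances that timed out. The following is a minimal sketch, not part of kola: the function name `parse_kola_failures` is hypothetical, and the regular expressions assume the log lines look exactly like the ones quoted in this issue.

```python
import re

# Patterns derived only from the log lines quoted in this issue (an assumption;
# other kola versions or failure modes may format their output differently).
FAIL_RE = re.compile(r"--- FAIL: (?P<test>\S+) \((?P<secs>[\d.]+)s\)")
TIMEOUT_RE = re.compile(
    r"TIMEOUT\[(?P<limit>[^\]]+)\]: SSH unsuccessful within allotted "
    r"timeframe for (?P<instance>[\w-]+)\."
)


def parse_kola_failures(log_text):
    """Return one dict per '--- FAIL:' line, with SSH timeout details if present."""
    failures = []
    current = None
    for line in log_text.splitlines():
        fail = FAIL_RE.search(line)
        if fail:
            current = {
                "test": fail["test"],
                "seconds": float(fail["secs"]),
                "limit": None,
                "instance": None,
            }
            failures.append(current)
            continue
        timeout = TIMEOUT_RE.search(line)
        if timeout and current is not None:
            # Attach the timeout to the most recent FAIL line above it.
            current["limit"] = timeout["limit"]
            current["instance"] = timeout["instance"]
    return failures
```

Running it over the kola-aws excerpt yields two entries, one per failed test, each carrying the timeout limit and the instance ID that never became reachable over SSH.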

Reproduction steps

Run fcos.ignition.v3.noop and fcos.ignition.misc.empty against the latest rawhide x86_64 build on the cloud providers: aws, azure, gcp, openstack.

Expected behavior

The fcos.ignition.v3.noop and fcos.ignition.misc.empty tests pass.

Actual behavior

The fcos.ignition.v3.noop and fcos.ignition.misc.empty tests fail.

System details

[rawhide][x86_64] - 40.20231211.91.0 - aws, azure, gcp, openstack

Butane or Ignition config

No response

Additional information

No response

dustymabe commented 7 months ago

One thing to note: on platforms that support it, the framework sets a key both in Ignition and in the metadata service (i.e. the one that gets set via Afterburn) unless it's told not to. Here is where it is getting set for Afterburn on Azure. For the fcos.ignition.v3.noop and fcos.ignition.misc.empty tests, the framework won't set the key for Ignition, so we are relying on the Afterburn-set SSH authorized keys entry (which is essentially the purpose of these tests).
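
To make the behavior described above concrete, here is a rough sketch of the decision. It is not kola's actual code: the function `ssh_key_sources` and its parameters are hypothetical names used only to illustrate which key paths are in play for a given test.

```python
def ssh_key_sources(platform_has_metadata_service, skip_ignition_key):
    """Illustrative only: list where the test SSH key would be injected.

    platform_has_metadata_service: the platform can deliver a key through its
        metadata service, which Afterburn then writes to authorized keys.
    skip_ignition_key: the test asked the framework not to add the key to the
        Ignition config (as described for the noop and empty tests).
    """
    sources = []
    if not skip_ignition_key:
        sources.append("ignition")
    if platform_has_metadata_service:
        sources.append("afterburn")
    return sources


# Typical test on a cloud platform: both paths are available.
print(ssh_key_sources(True, False))
# fcos.ignition.v3.noop / fcos.ignition.misc.empty: only the Afterburn path.
print(ssh_key_sources(True, True))
```

With only the Afterburn path left, anything that stops Afterburn from writing the key leaves the instance unreachable over SSH, which matches the timeout seen in these two tests.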

One possibility is that the SSH key being created/set now uses a key type or algorithm that the crypto policies in rawhide no longer allow.

gursewak1997 commented 6 months ago

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2254975

gursewak1997 commented 5 months ago

Waiting on https://github.com/fedora-selinux/selinux-policy/pull/2000

HuijingHei commented 4 months ago

Closing this as fixed by https://koji.fedoraproject.org/koji/buildinfo?buildID=2401875 and https://github.com/coreos/fedora-coreos-config/pull/2853