coreos / coreos-assembler

Tooling container to assemble CoreOS-like systems
https://coreos.github.io/coreos-assembler/
Apache License 2.0

Bump to Fedora 40 #3785

Closed jlebon closed 2 months ago

jlebon commented 2 months ago

Some of our upstream CIs (ostree, rpm-ostree) require cosa and FCOS to be on the same release. Ideally we'd fix that, but there are details to sort out there and we want to move cosa anyway.

jlebon commented 2 months ago

Didn't test this at all. Let's see what CI says.

jlebon commented 2 months ago

openshift/release PR: https://github.com/openshift/release/pull/51370

jlebon commented 2 months ago

(Testing locally as well in parallel now.)

Let's also push a release and add a Quay.io tag before merging this.

dustymabe commented 2 months ago

> Let's also push a release and add a Quay.io tag before merging this.

Agree. Ideally we build the next stable with a base at least similar to what testing was built with.

jlebon commented 2 months ago

Prow needs https://github.com/openshift/release/pull/51370.

jlebon commented 2 months ago

/retest

travier commented 2 months ago

/test ci/prow/images
/test ci/prow/rhcos

openshift-ci[bot] commented 2 months ago

@travier: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

Use /test all to run all jobs.

In response to [this](https://github.com/coreos/coreos-assembler/pull/3785#issuecomment-2078876512):

>/test ci/prow/images
>/test ci/prow/rhcos

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

travier commented 2 months ago

/test images
/test rhcos

jlebon commented 2 months ago

CoreOS CI hanging at the cosa fetch --strict step. Possibly something going wrong with supermin. Prow is timing out, likely because of the same issue but for some reason we're not getting any logs there.

jlebon commented 2 months ago

Seems related to virtio-serial writes from the guest side sometimes hanging for some reason. (I.e. writes to /dev/virtio-ports/cosa-cmdout.)
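
For readers who want to check whether a guest-side write is stalling, here is a minimal Python sketch of a write with a timeout. It is not how cosa writes to the port. The helper name `write_with_timeout` is hypothetical, and the sketch assumes the file descriptor supports non-blocking writes and `select()`, which is the case for pipes and is assumed here for virtio-serial port devices.

```python
import os
import select


def write_with_timeout(fd: int, data: bytes, timeout: float) -> int:
    """Write data to fd, giving up once fd stays unwritable for `timeout` seconds.

    Returns the number of bytes actually written, which is less than
    len(data) if the other end stopped draining the channel.
    Note: this leaves fd in non-blocking mode.
    """
    os.set_blocking(fd, False)
    written = 0
    while written < len(data):
        # Wait until the kernel reports buffer space; the timeout applies
        # to each wait, not to the whole call.
        _, writable, _ = select.select([], [fd], [], timeout)
        if not writable:
            break  # nobody is reading: this is where a blocking write would hang
        try:
            written += os.write(fd, data[written:])
        except BlockingIOError:
            continue  # buffer filled between select() and write(); wait again
    return written
```

Inside a guest, one would open `/dev/virtio-ports/cosa-cmdout` with `os.open(path, os.O_WRONLY)` and pass the resulting descriptor. A return value shorter than the payload suggests the host side is not reading.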

jlebon commented 2 months ago

> CoreOS CI hanging at the cosa fetch --strict step.

OK, latest commit seems to have fixed it! Looked a bit through git log v8.1.3..v8.2.2 in QEMU to see if anything obvious pops out but didn't see anything.

dustymabe commented 2 months ago

Since we have to run CI again anyway, maybe let's update tests/containers/tang/Containerfile too.

jlebon commented 2 months ago

OK weird, debugging in the pod, it looks like Prow is still hitting the same hanging issue that I thought 7857488 (#3785) fixed. And even more fun, I can't get this hang to reproduce when running manually in the pod. So I think there's a race somewhere and the commit just made it less likely.

Anyway, this now sounds like possibly some bug when combining virtio-serial and stdio. I think I'll just rework this to use a regular serial device instead of virtio-serial since that's obviously way more battle-tested.

jlebon commented 2 months ago

OK, ran out of cycles trying to debug this. I've ended up having to essentially revert 4eb19f46f, which is unfortunate. But at least it passes CI in both Prow and CoreOS CI.

> I think I'll just rework this to use a regular serial device instead of virtio-serial

The problem with this is that it doesn't work on all arches. E.g. on aarch64, adding another --serial doesn't create a /dev/ttyAMA1 device.

jlebon commented 2 months ago

I have some work in progress on a minimal/self-contained reproducer to file a bug, but it's proving trickier than expected.
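
An intermittent hang like this one is usually chased by running the suspect command many times under a timeout and counting how often it stalls. The Python sketch below illustrates that general approach and is not the reproducer mentioned above. The function name `count_hangs` is hypothetical, and the command to run is left to the caller.

```python
import subprocess


def count_hangs(cmd, attempts, timeout):
    """Run cmd `attempts` times and return the indices of runs that timed out.

    stdio is inherited from the parent on purpose, since the thread
    suspects an interaction between virtio-serial and stdio.
    """
    hung = []
    for i in range(attempts):
        try:
            # On timeout, run() kills the child and raises TimeoutExpired.
            subprocess.run(cmd, timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            hung.append(i)
    return hung
```

For example, `count_hangs(["cosa", "fetch", "--strict"], 20, 600)` would run the step that hung in CoreOS CI twenty times with a ten-minute limit each. The attempt count and timeout here are arbitrary illustrative values.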

jlebon commented 2 months ago

Since CI already passed on this, let's just merge it in to unbreak CI and get to any other fallout faster.