coreos / coreos-assembler

Tooling container to assemble CoreOS-like systems
https://coreos.github.io/coreos-assembler/
Apache License 2.0

`coreos-assembler build` failing for RHCOS during `anaconda` phase #325

Closed miabbott closed 5 years ago

miabbott commented 5 years ago

Edit: The original error below is just a symptom of an error earlier in the build phase - https://github.com/coreos/coreos-assembler/issues/325#issuecomment-461174135


Using the latest build from quay.io[1], @yuqi-zhang and I are observing the following error when trying to build RHCOS from the internal repo:

libguestfs: error: download: /boot/loader/grub.cfg: No such file or directory

We followed the new instructions for setting up the working directory and ran the builds inside `coreos-assembler shell`.

[1] see below:

$ jq -C . /cosa/coreos-assembler-git.json 
{
  "date": "2019-02-06T17:42:44Z",
  "git": {
    "commit": "cfc9e0c39f9bd80f9f698d2274bd300308d087d7",
    "origin": "git@github.com:coreos/coreos-assembler.git",
    "dirty": "false"
  },
  "file": {
    "checksum": "7933e29e82e4c8069829379efde582de8a927e8074743fe400013a4b9ce3a7c3",
    "checksum_type": "sha256",
    "format": "tar.gz",
    "name": "coreos-assembler-git.tar.gz",
    "size": "4860593"
  }
}
cgwalters commented 5 years ago

That's the "anaconda failed" error. There should be more useful logs in tmp/build/anaconda or so.
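
A rough way to triage those logs is to filter them for lines that look like failures. The sketch below is only an illustration: it assumes the `HH:MM:SS,mmm LEVEL source: message` line shape seen in the excerpts quoted in this thread, it treats `ERR` as a level alongside the `CRT` level seen here (an assumption), and the names `find_problems` and `scan_dir` are hypothetical helpers, not part of coreos-assembler.

```python
import re
from pathlib import Path

# Assumed line shape, e.g.
# "19:47:28,980 INF program: grub2-install: error: unknown filesystem."
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2},\d{3}) (\w{3}) (\w+): (.*)$")


def find_problems(lines):
    """Return (timestamp, level, source, message) for lines that look like failures."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip())
        if not m:
            continue  # traceback bodies and blank lines do not match; skip them
        ts, level, source, msg = m.groups()
        # Many tool failures are logged at INF level, so also match on the text.
        if level in ("ERR", "CRT") or "error:" in msg:
            hits.append((ts, level, source, msg))
    return hits


def scan_dir(log_dir):
    """Print suspicious lines from every *.log file in log_dir (hypothetical helper)."""
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(errors="replace") as f:
            for hit in find_problems(f):
                print(path.name, *hit)
```

Calling `scan_dir("tmp/build/anaconda")` from the working directory would be the obvious use, with the caveat that the exact log location is only approximate ("or so") in the comment above.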

yuqi-zhang commented 5 years ago

I think the relevant errors are:

19:47:28,980 INF program: Installing for i386-pc platform.
19:47:28,980 INF program: grub2-install: error: unknown filesystem.
19:47:28,981 DBG program: Return code: 1
19:47:28,983 INF program: Running in chroot '/mnt/sysimage/ostree/deploy/redhat-coreos-maipo/deploy/3ac79e869c2d42e413ad914ca985a1c520c561b6145c757f00b52db7c7bf8d2b.0'... grub2-mkconfig -o /boot/grub2/grub.cfg
19:47:29,336 INF program: Generating grub configuration file ...
19:47:29,337 INF program: /usr/sbin/grub2-probe: error: unknown filesystem.

and

19:47:29,338 INF threading: Thread Failed: AnaInstallThread (140569969612544)
19:47:29,338 DBG exception: running handleException
19:47:29,340 CRT exception: Traceback (most recent call last):

  File "/usr/lib64/python3.7/site-packages/pyanaconda/bootloader.py", line 1656, in write
    self.install()

  File "/usr/lib64/python3.7/site-packages/pyanaconda/bootloader.py", line 1641, in install
    raise BootLoaderError("boot loader install failed")

pyanaconda.bootloader.BootLoaderError: boot loader install failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/usr/lib64/python3.7/site-packages/pyanaconda/bootloader.py", line 2511, in writeBootLoaderFinal
    storage.bootloader.write()

  File "/usr/lib64/python3.7/site-packages/pyanaconda/bootloader.py", line 1660, in write
    self.write_config()

  File "/usr/lib64/python3.7/site-packages/pyanaconda/bootloader.py", line 1613, in write_config
    raise BootLoaderError("failed to write boot loader configuration")

pyanaconda.bootloader.BootLoaderError: failed to write boot loader configuration
miabbott commented 5 years ago

Attaching install.log and program.log

install.log

program.log

cgwalters commented 5 years ago

OK reproduced this here.

dustymabe commented 5 years ago

we did just switch to f29 (#312) - maybe related to that?

cgwalters commented 5 years ago

> we did just switch to f29 (#312) - maybe related to that?

Yep that's the root cause.

yuqi-zhang commented 5 years ago

I see this using a container built from Dockerfile.rhel as well though:

cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
...
cgwalters commented 5 years ago

> I see this using a container built from Dockerfile.rhel as well though:

The root cause of this is that the mkfs.xfs in the f29 anaconda environment creates filesystems in a newer format that the RHEL7 kernel doesn't understand. So switching the cosa container to rhel7 doesn't help; the installer also needs to be switched.

(This is a big thing that will go away with "no anaconda": we'll only have one userspace (the container) building things.)

miabbott commented 5 years ago

> The root cause of this is that the mkfs.xfs in the f29 anaconda environment creates filesystems in a newer format that the RHEL7 kernel doesn't understand.

I believe that problem looks like this?

19:50:05,976 INF program: Running in chroot '/mnt/sysimage/ostree/deploy/redhat-coreos-maipo/deploy/94b6298fd8a579b48a0a3beec933194d069371801c3c2ee859791a3bdf430574.0'... grub2-install --no-floppy /dev/vda
19:50:06,229 INF program: Installing for i386-pc platform.
19:50:06,229 INF program: grub2-install: error: unknown filesystem.
19:50:06,229 DBG program: Return code: 1
19:50:06,231 INF program: Running in chroot '/mnt/sysimage/ostree/deploy/redhat-coreos-maipo/deploy/94b6298fd8a579b48a0a3beec933194d069371801c3c2ee859791a3bdf430574.0'... grub2-mkconfig -o /boot/grub2/grub.cfg
19:50:06,554 INF program: Generating grub configuration file ...
19:50:06,555 INF program: /usr/sbin/grub2-probe: error: unknown filesystem.

dustymabe commented 5 years ago

> So switching the cosa container to rhel7 doesn't help, the installer also needs to be switched.

I think we planned to do that anyway - which is why we added https://github.com/coreos/coreos-assembler/commit/19eb525a43fb9fb9ba0717a23ad7fc533271da26

dustymabe commented 5 years ago

so summary of workarounds:

miabbott commented 5 years ago

> so summary of workarounds:

We need to propagate this to our internal repo

cgwalters commented 5 years ago

So I played around a bit yesterday with trying to convince F29 Anaconda to mkfs.xfs a filesystem mountable by a RHEL7 kernel and couldn't figure it out. It's probably not really worth trying to do.

For now I think we tell people who want to build older OSes (CentOS/RHEL7) to override the installer ISO.

Long term, with not-anaconda we'll end up using the tools from the container always. I think the most sustainable path is probably going to be detecting the target content major and adjusting the tooling to work with it. (e.g. if we detect the target content is centos7, then we drop down our mkfs.xfs invocations; if we can't do that with newer mkfs.xfs then we could even bundle the older version too).
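
To make the "detect the target content major and adjust the tooling" idea concrete, here is a minimal sketch, not anything coreos-assembler actually does. It assumes the target's `os-release` text is available, and it assumes that passing `-m reflink=0 -i sparse=0` to a newer mkfs.xfs is the right way to "drop down" for a major-version-7 target; that choice of options is unverified in this thread. All function and table names are hypothetical.

```python
import re

# ASSUMPTION: mkfs.xfs options that turn off newer on-disk features for older
# targets. Whether these are the features that matter here is not established.
LEGACY_XFS_ARGS = {
    7: ["-m", "reflink=0", "-i", "sparse=0"],
}


def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def target_major(fields):
    """Return the leading integer of VERSION_ID (or VERSION), else None."""
    m = re.match(r"(\d+)", fields.get("VERSION_ID") or fields.get("VERSION", ""))
    return int(m.group(1)) if m else None


def mkfs_xfs_command(device, os_release_text):
    """Build an mkfs.xfs argv, adding legacy options for known older majors."""
    major = target_major(parse_os_release(os_release_text))
    return ["mkfs.xfs", *LEGACY_XFS_ARGS.get(major, []), device]
```

This sketch keys only on the major number and ignores the distribution ID, which a real implementation would likely need to check as well. If a newer mkfs.xfs cannot produce a compatible filesystem at all, the lookup table could instead select a bundled older binary, as suggested above.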

dustymabe commented 5 years ago

> So I played around a bit yesterday with trying to convince F29 Anaconda to mkfs.xfs a filesystem mountable by a RHEL7 kernel and couldn't figure it out. It's probably not really worth trying to do.
>
> For now I think we tell people who want to build older OSes (CentOS/RHEL7) to override the installer ISO.

👍

> Long term, with not-anaconda we'll end up using the tools from the container always. I think the most sustainable path is probably going to be detecting the target content major and adjusting the tooling to work with it. (e.g. if we detect the target content is centos7, then we drop down our mkfs.xfs invocations; if we can't do that with newer mkfs.xfs then we could even bundle the older version too).

That will be handled by using a COSA that matches the target, right?

dustymabe commented 5 years ago

Closing since nothing to do here.

cgwalters commented 5 years ago

> That will be handled by using a COSA that matches the target, right?

That's one possibility, but in the text you quoted I also outlined a possible different approach where we support downgrading when necessary.

In practice though...I really hope that this problem solves itself as we primarily target newer hosts going forward forever.

dustymabe commented 5 years ago

> as we primarily target newer hosts going forward forever.

👍 👍 👍 👍 👍 👍 👍 👍