OSInside / kiwi

KIWI - Appliance Builder Next Generation
https://osinside.github.io/kiwi
GNU General Public License v3.0

Syncing root filesystem data - the process hangs forever #1062

Closed jpeter01 closed 5 years ago

jpeter01 commented 5 years ago

Problem description

I am trying to build an oem image, but at a certain point the build process hangs and never continues.

Expected behaviour

It should finish the build process without any problem.

Steps to reproduce the behaviour

Start a kiwi-ng build:

$ kiwi-ng-3 --color-output --type oem system build --target-dir /srv/kiwi/this_build-4.0.4 --description .

Even if you let it run for 2 days, it does not finish. The process should normally take less than 2 hours.

OS and Software information

additional info

There is no problem with free space or RAM; the kiwi process hangs at rsync.

Kiwi output:

[ INFO ]: 10:44:08 | --> Syncing EFI boot data to EFI partition
[ WARNING ]: 10:44:08 | Extended attributes not supported for target: /tmp/kiwi_mount_manager._l_wx67x
[ INFO ]: 10:44:08 | --> Syncing boot data at extra partition
[ INFO ]: 10:44:09 | --> Syncing root filesystem data

Kiwi log:

INFO: 10:44:08 | --> GRUB_THEME:/boot/grub2/themes/openSUSE/theme.txt
INFO: 10:44:08 | --> GRUB_TIMEOUT:10
INFO: 10:44:08 | --> GRUB_USE_INITRDEFI:true
INFO: 10:44:08 | --> GRUB_USE_LINUXEFI:true
INFO: 10:44:08 | Writing sysconfig bootloader file
INFO: 10:44:08 | --> DEFAULT_APPEND:"splash lang=hu_HU nomodeset root=UUID=c543cb96-87fd-480e-9ab2-eda8f9901b9f rw rd.auto"
INFO: 10:44:08 | --> FAILSAFE_APPEND:"splash lang=hu_HU nomodeset root=UUID=c543cb96-87fd-480e-9ab2-eda8f9901b9f rw rd.auto ide=nodma apm=off noresume edd=off nomodeset 3 rd.auto"
INFO: 10:44:08 | --> LOADER_LOCATION:none
INFO: 10:44:08 | --> LOADER_TYPE:grub2-efi
INFO: 10:44:08 | Creating config.bootoptions
INFO: 10:44:08 | Syncing system to image
INFO: 10:44:08 | --> Syncing EFI boot data to EFI partition
DEBUG: 10:44:08 | EXEC: [mountpoint -q /tmp/kiwi_mount_manager._l_wx67x]
DEBUG: 10:44:08 | EXEC: [mount /dev/mapper/loop0p2 /tmp/kiwi_mount_manager._l_wx67x]
WARNING: 10:44:08 | Extended attributes not supported for target: /tmp/kiwi_mount_manager._l_wx67x
DEBUG: 10:44:08 | EXEC: [rsync -a -H --one-file-system /srv/kiwi/huedu-4.0.4/build/image-root/boot/efi/ /tmp/kiwi_mount_manager._l_wx67x]
DEBUG: 10:44:08 | EXEC: [mountpoint -q /tmp/kiwi_mount_manager._l_wx67x]
DEBUG: 10:44:08 | EXEC: [umount /tmp/kiwi_mount_manager._l_wx67x]
INFO: 10:44:08 | --> Syncing boot data at extra partition
DEBUG: 10:44:08 | EXEC: [mountpoint -q /tmp/kiwi_mountmanager.5mk1dyi]
DEBUG: 10:44:08 | EXEC: [mount /dev/mapper/loop0p3 /tmp/kiwi_mountmanager.5mk1dyi]
DEBUG: 10:44:08 | EXEC: [rsync -a -H -X -A --one-file-system --exclude /efi/ /srv/kiwi/huedu-4.0.4/build/image-root/boot/ /tmp/kiwi_mountmanager.5mk1dyi]
DEBUG: 10:44:08 | EXEC: [mountpoint -q /tmp/kiwi_mountmanager.5mk1dyi]
DEBUG: 10:44:08 | EXEC: [umount /tmp/kiwi_mountmanager.5mk1dyi]
INFO: 10:44:09 | --> Syncing root filesystem data
DEBUG: 10:44:09 | EXEC: [mountpoint -q /tmp/kiwi_volumes.h4rtd9e1]
DEBUG: 10:44:09 | EXEC: [rsync -a -H -X -A --one-file-system --exclude /image --exclude /.profile --exclude /.kconfig --exclude /.buildenv --exclude /var/cache/kiwi --exclude /boot/* --exclude /boot/.* --exclude /boot/efi/* --exclude /boot/efi/.* /srv/kiwi/huedu-4.0.4/build/image-root/ /tmp/kiwi_volumes.h4rtd9e1]

schaefi commented 5 years ago

I tried the following to reproduce this issue:

$ git clone https://github.com/SUSE/kiwi-descriptions.git
$ cd kiwi-descriptions/suse/x86_64/suse-leap-15.0-JeOS/
$ sudo kiwi-ng-3 --color-output --type oem system build --target-dir /tmp/mytest/ --description .

and it just builds.

Your target dir, --target-dir /srv/kiwi/this_build-4.0.4: what is this? Anything special, e.g. NFS mounted such that rsync might have a problem? Just guessing here. In any case, at the time you think it hangs, you can repeat the rsync call in an extra shell and see whether it outputs something that points to a hang.
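
Editor's note: the suggestion to repeat the rsync call by hand can also be scripted so that a hang and an ordinary failure are told apart automatically. The following is only a minimal sketch, not part of kiwi; the helper name `run_with_timeout` and the timeout values are assumptions made for illustration. It relies on Python's standard `subprocess.run`, which kills the child process and raises `TimeoutExpired` once the timeout elapses.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report whether it hung or finished.

    Returns ("hung", None) if cmd is still running after timeout_s seconds,
    otherwise ("finished", <exit code>).
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point
        return ("hung", None)
    if result.stderr:
        # Show the last few error lines, e.g. "No space left on device"
        print("\n".join(result.stderr.splitlines()[-5:]), file=sys.stderr)
    return ("finished", result.returncode)
```

To try it against the real case, pass the rsync argument list from the kiwi log as `cmd` (run as root, with the kiwi mount still in place) and choose a timeout well above the normal sync time. A non-zero exit code together with error lines on stderr points to a failing rsync rather than a hanging one.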

jpeter01 commented 5 years ago

Nothing special, it is a local xfs filesystem. I tried the rsync manually; it shows that the filesystem is full:

...
rsync: mkstemp "/tmp/kiwi_volumes.1h37f1hz/srv/www/squidanalyzer/.sorttable.js.jUvJk0" failed: No space left on device (28)
rsync: mkstemp "/tmp/kiwi_volumes.1h37f1hz/srv/www/squidanalyzer/.squidanalyzer.css.0Av67U" failed: No space left on device (28)

I have set the root partition size to 20G, but during the build process it is smaller. The root FS size is 4.8 GB during the build process; the final ISO size should be ~1.7 GB.

Additional info:

$ du -sh /srv/kiwi/huedu-4.0.4/build/image-root/
4,8G    /srv/kiwi/huedu-4.0.4/build/image-root/
$ df -hT
Filesystem                             Type      Size  Used Avail Use% Mounted on
devtmpfs                               devtmpfs  2.0G  8.0K  2.0G   1% /dev
tmpfs                                  tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs                                  tmpfs     2.0G  9.0M  2.0G   1% /run
tmpfs                                  tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/xvda2                             xfs        28G  3.4G   25G  12% /
/dev/mapper/system--disk2--build42-srv xfs       110G   64G   47G  58% /srv
/dev/mapper/vgsystem-LVRoot            ext4      6.0G  5.4G  359M  94% /tmp/kiwi_volumes.1h37f1hz
/dev/mapper/vgsystem-home              ext4      101M   52M   42M  56% /tmp/kiwi_volumes.1h37f1hz/home
/dev/mapper/vgsystem-srv               ext4      496M  382M   79M  83% /tmp/kiwi_volumes.1h37f1hz/srv
/dev/mapper/vgsystem-var               ext4       66M   30M   32M  49% /tmp/kiwi_volumes.1h37f1hz/var
tmpfs                                  tmpfs     393M     0  393M   0% /run/user/0
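
Editor's note: the du and df figures above (a 4,8G image-root against a 6.0G LVRoot with 359M left) can be compared programmatically before starting a long sync. This is a rough sketch under stated assumptions: the function names are made up for illustration, it sums apparent file sizes instead of the block usage that du reports, and it ignores filesystem overhead, kiwi's exclude list and the fact that /home, /srv and /var live on separate volumes.

```python
import os
import shutil


def tree_size_bytes(root):
    """Sum the apparent sizes of regular files below root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits(source_root, target_mount):
    """Return (needed, available, ok) for copying source_root to target_mount.

    needed and available are byte counts; ok is True when the data would
    fit into the free space currently reported for target_mount.
    """
    needed = tree_size_bytes(source_root)
    available = shutil.disk_usage(target_mount).free
    return needed, available, needed <= available
```

For the case in this thread, the arguments would be the image-root directory and the /tmp/kiwi_volumes.* mount point that is present while the build is running. A per-volume check would call `fits` once for each of the subdirectories that has its own volume.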

config.xml (disk):

       <type image="oem" filesystem="ext4" initrd_system="dracut" installiso="true" bootloader="grub2" kernelcmdline="splash lang=hu_HU nomodeset" mdraid="mirroring" firmware="efi">
            <oemconfig>
                <oem-systemsize>40960</oem-systemsize>
                <oem-swap>true</oem-swap>
                <oem-swapsize>2048</oem-swapsize>
                <oem-device-filter>/dev/ram</oem-device-filter>
                <oem-multipath-scan>false</oem-multipath-scan>
            </oemconfig>
<!--            
            <machine memory="512" guestOS="suse" HWversion="4">
                <vmdisk id="0" controller="ide"/>
                <vmnic driver="e1000" interface="0" mode="bridged"/>
            </machine>
-->
            <systemdisk name="vgsystem" preferlvm="true">
<!--            <systemdisk name="vgsystem">    -->
             <volume name="srv" freespace="6000M"/>
             <volume name="var" freespace="6000M"/>
             <volume name="home" freespace="6000M"/>
             <volume name="@root" size="20000M"/>
<!--              <volume name="@root" freespace="10000M"/> -->
            </systemdisk>
        </type>
    </preferences>
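
Editor's note: to compare what the description asks for with what df reports during the build, the volume requests can be read from the systemdisk section. The sketch below uses only the standard library; the fragment is copied from the excerpt above with the commented-out lines dropped, and the helper names are hypothetical. It assumes that sizes carry an M or G suffix, and that `size` is an absolute size while `freespace` is space added on top of the volume's data, which is how the KIWI documentation describes them as far as I know.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the config.xml excerpt (commented-out lines dropped)
SYSTEMDISK = """
<systemdisk name="vgsystem" preferlvm="true">
  <volume name="srv" freespace="6000M"/>
  <volume name="var" freespace="6000M"/>
  <volume name="home" freespace="6000M"/>
  <volume name="@root" size="20000M"/>
</systemdisk>
"""


def to_mib(value):
    """Convert a size such as '6000M' or '20G' to MiB (assumed suffixes)."""
    number, unit = value[:-1], value[-1].upper()
    factor = {"M": 1, "G": 1024}[unit]
    return int(number) * factor


def requested_volumes(xml_text):
    """Return {volume name: (kind, MiB)}; kind is 'size' or 'freespace'."""
    result = {}
    for vol in ET.fromstring(xml_text).iter("volume"):
        for kind in ("size", "freespace"):
            if kind in vol.attrib:
                result[vol.get("name")] = (kind, to_mib(vol.get(kind)))
    return result
```

Here `requested_volumes(SYSTEMDISK)` lists @root with an absolute size of 20000 MiB, while the df output earlier in the thread shows LVRoot at 6.0G during the build. The sketch only makes that gap visible; it does not explain it.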

schaefi commented 5 years ago

If it runs into a no-space-left issue, rsync usually does not hang but just returns with an error. This is very strange. I still think it might be a conflict with LVM. Is /dev/mapper/system--disk2--build42-srv also living in a logical volume?

Building an LVM-based image on a host that uses LVM for itself has caused problems in the past, but the reported hang would be a new thing. What do vgdisplay and lvdisplay show on your system?

jpeter01 commented 5 years ago

Yes, /dev/mapper/system--disk2--build42-srv is on an LV. While the build process is running, the LVs from the build also show up (when the build process isn't running, only the system-disk2-build42 VG and the /dev/system-disk2-build42/srv LV are listed):

$ vgdisplay
  --- Volume group ---
  VG Name               system-disk2-build42
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  9
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               110,00 GiB
  PE Size               4,00 MiB
  Total PE              28159
  Alloc PE / Size       28159 / 110,00 GiB
  Free  PE / Size       0 / 0   
  VG UUID               Y6WPdL-Grw8-aVdi-01mg-mpl1-6c31-7NB65q

  --- Volume group ---
  VG Name               vgsystem
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                4
  Open LV               4
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               6,87 GiB
  PE Size               4,00 MiB
  Total PE              1758
  Alloc PE / Size       1746 / 6,82 GiB
  Free  PE / Size       12 / 48,00 MiB
  VG UUID               5gW6ii-knj2-YrG5-XAkL-KbFW-LP8w-UDzV6I

$ lvdisplay
  --- Logical volume ---
  LV Path                /dev/system-disk2-build42/srv
  LV Name                srv
  VG Name                system-disk2-build42
  LV UUID                aEVmte-toPu-Bu9a-CcG4-gdh2-PxXq-2lgs3L
  LV Write Access        read/write
  LV Creation host, time build42, 2018-04-12 12:50:47 +0200
  LV Status              available
  # open                 1
  LV Size                110,00 GiB
  Current LE             28159
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:0

  --- Logical volume ---
  LV Path                /dev/vgsystem/home
  LV Name                home
  VG Name                vgsystem
  LV UUID                hVMrhy-0zRj-O9CG-BzNL-Ekxo-vfMx-czAnk1
  LV Write Access        read/write
  LV Creation host, time build15, 2019-04-23 11:09:39 +0200
  LV Status              available
  # open                 1
  LV Size                108,00 MiB
  Current LE             27
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:5

  --- Logical volume ---
  LV Path                /dev/vgsystem/srv
  LV Name                srv
  VG Name                vgsystem
  LV UUID                ndNWMh-Tu0l-wNp4-fckK-34qS-FALC-1vhVFU
  LV Write Access        read/write
  LV Creation host, time build15, 2019-04-23 11:09:39 +0200
  LV Status              available
  # open                 1
  LV Size                520,00 MiB
  Current LE             130
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:6

  --- Logical volume ---
  LV Path                /dev/vgsystem/var
  LV Name                var
  VG Name                vgsystem
  LV UUID                mp2h3U-t7Kw-FCSg-pCdf-wOMi-p2Cz-BHlYrA
  LV Write Access        read/write
  LV Creation host, time build15, 2019-04-23 11:09:40 +0200
  LV Status              available
  # open                 1
  LV Size                72,00 MiB
  Current LE             18
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:7

  --- Logical volume ---
  LV Path                /dev/vgsystem/LVRoot
  LV Name                LVRoot
  VG Name                vgsystem
  LV UUID                hQjiYI-TPKm-55Tn-zbQY-Gu7x-pkLc-gMa7t1
  LV Write Access        read/write
  LV Creation host, time build15, 2019-04-23 11:09:41 +0200
  LV Status              available
  # open                 1
  LV Size                6,14 GiB
  Current LE             1571
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     1024
  Block device           254:8

P.S. An earlier kiwi version had no problem with the same setup (maybe one of 9.14.x or 9.15.x).

jpeter01 commented 5 years ago

I modified the build system; it is now a system without any LV, but the build process still hangs at the rsync phase. If I run it manually:

$ rsync -a -H -X -A --one-file-system --exclude /image --exclude /.profile --exclude /.kconfig --exclude /.buildenv --exclude /var/cache/kiwi --exclude /boot/* --exclude /boot/.* --exclude /boot/efi/* --exclude /boot/efi/.* /srv/kiwi/huedu-4.0.4/build/image-root/ /tmp/kiwi_volumes.td4obmsq

I get the same as before:

rsync: mkstemp "/tmp/kiwi_volumes.td4obmsq/srv/www/roundcubemail/public_html/.index.php.v2zYTm" failed: No space left on device (28)
rsync: mkstemp "/tmp/kiwi_volumes.td4obmsq/srv/www/roundcubemail/vendor/.autoload.php.7fUvhA" failed: No space left on device (28)
rsync: mkstemp "/tmp/kiwi_volumes.td4obmsq/srv/www/squidanalyzer/.flotr2.js.dyj3EN" failed: No space left on device (28)
rsync: mkstemp "/tmp/kiwi_volumes.td4obmsq/srv/www/squidanalyzer/.sorttable.js.0u0D20" failed: No space left on device (28)
rsync: mkstemp "/tmp/kiwi_volumes.td4obmsq/srv/www/squidanalyzer/.squidanalyzer.css.wa9eqe" failed: No space left on device (28)

If you need the build sources for debugging, I can share them with you.
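
Editor's note: every rsync error quoted in this thread is under /srv, which hints at which volume ran out of space. The sketch below groups such errors by the first path component below the kiwi mount point. The function name and the regular expression are illustrative assumptions, based only on the error lines shown in this thread; 28 is the ENOSPC errno value on Linux.

```python
import re
from collections import Counter

# Matches lines like:
# rsync: mkstemp "/tmp/kiwi_volumes.XXXX/srv/www/..." failed: No space left on device (28)
ENOSPC_LINE = re.compile(
    r'^rsync: \w+ "(?P<path>[^"]+)" failed: No space left on device \(28\)'
)


def enospc_by_toplevel(lines, mount_root):
    """Count ENOSPC failures per first path component below mount_root."""
    counts = Counter()
    prefix = mount_root.rstrip("/") + "/"
    for line in lines:
        match = ENOSPC_LINE.match(line.strip())
        if not match or not match.group("path").startswith(prefix):
            continue
        relative = match.group("path")[len(prefix):]
        # "srv/www/squidanalyzer/..." -> "srv"
        counts[relative.split("/", 1)[0]] += 1
    return counts
```

Fed with the five error lines above and the mount root /tmp/kiwi_volumes.td4obmsq, all failures are counted under srv, so the srv volume would be the first place to check with df.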

jpeter01 commented 5 years ago

Thanks for your help! We can close this; it was some naughty FS bug. Earlier I tried xfs_repair -n on the build system, but it didn't show any errors. Today I deleted the build source directories and rsync-ed them to the build system again, and it doesn't hang anymore.

schaefi commented 5 years ago

Thanks much for hunting this down. I was racking my brain over what in kiwi could have caused this, as you mentioned it worked in a former version :)