Closed sphh closed 4 years ago
Can you post the entries in the boot directory (i.e. run ls -al /boot)?
I already deleted the files by hand to save some space in the /boot partition, but here it is anyway:
$ ls /boot/
config-4.15.0-106-generic
config-4.19.123-surface-lts
config-5.6.15-surface
efi/
grub/
initrd.img-4.15.0-106-generic
initrd.img-4.19.123-surface-lts
initrd.img-5.6.15-surface
lost+found/
memtest86+.bin
memtest86+.elf
memtest86+_multiboot.bin
System.map-4.15.0-106-generic
System.map-4.19.123-surface-lts
System.map-5.6.15-surface
vmlinuz-4.15.0-106-generic
vmlinuz-4.19.123-surface-lts
vmlinuz-5.6.15-surface
I can get a new list after the next round of surface kernel updates.
I cannot reproduce this in an LXC container running Ubuntu 19.10. The postrm script of the package calls the hooks in /etc/kernel/postrm.d with, as far as I can tell, the correct version, which should then take care of updating/removing the initrd, at least on Ubuntu and Debian.
Which distribution are you running, specifically?
initrd.img-5.6.15-surface is exactly the same name I see in my container, so there doesn't seem to be any difference there.
I run Linux Mint 19.3 Cinnamon.
The "problem" was that after a kernel update, say from 5.6.14 to 5.6.15, these files were left in /boot (I hope I reconstructed it correctly):
config-5.6.15-surface
initrd.img-5.6.14-surface
initrd.img-5.6.15-surface
System.map-5.6.15-surface
vmlinuz-5.6.15-surface
This seems to be a problem with the /etc/kernel/postrm.d/initramfs-tools script, which updates the initramfs after package removal/upgrade. It contains:
if [ -z "$1" ] || [ "$1" != "remove" ]; then
exit 0
fi
which exits early if the package is not being removed, e.g. when it is upgraded. It seems that Debian/Ubuntu expect all kernel packages to carry their version in the package name. That way the package never really gets upgraded; instead it is removed and a completely new package (with a different name) is installed.
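To illustrate the guard quoted above, here is a minimal sketch. The function name hook_decision is hypothetical and only mirrors the quoted test on the first argument; it is not the actual hook script.

```shell
# Hypothetical stand-in for the guard at the top of the postrm.d hook.
# Prints what the hook would do for a given first argument.
hook_decision() {
    if [ -z "$1" ] || [ "$1" != "remove" ]; then
        # Empty argument or anything other than "remove" (e.g. "upgrade"):
        # the hook exits before touching the initrd.
        echo "skip"
        return 0
    fi
    # Only a real removal reaches the cleanup part of the hook.
    echo "cleanup"
}
```

Calling hook_decision upgrade prints skip, while hook_decision remove prints cleanup, which matches the leftover initrd.img files seen after an in-place upgrade.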
There's nothing we can do to change the /etc/kernel/postrm.d/initramfs-tools script, since it's not part of the kernel package. To replicate the behavior of original Debian/Ubuntu packages, we'd have to switch to meta-packages (https://github.com/linux-surface/repo/issues/2).
Would it be possible to remove the stale initrd.img-xxx file with the help of package maintainer scripts? It has been so long since I was responsible for a Debian package that I can't remember all the details...
Possible, yes, although that would bypass the initrd tools and I'm not sure if that is a good idea. I think that a better quick-fix would probably be modifying https://github.com/torvalds/linux/blob/b3a9e3b9622ae10064826dccb4f7a52bd88c7407/scripts/package/builddeb#L193 for postrm by replacing upgrade with remove (first parameter to the script). The best solution would be to switch to meta-packages, but doing that will cost more time.
Can you post the entries in the boot directory (i.e. run ls -al /boot)?
I already deleted the files by hand to save some space in the /boot partition
Here is my output, where I didn't delete any files:
drwxr-xr-x 4 root root 4096 Jun 12 13:38 .
drwxr-xr-x 18 root root 4096 Mar 15 14:43 ..
-rw-r--r-- 1 root root 241743 Jun 1 05:39 config-5.6.15-surface
-rw-r--r-- 1 root root 243596 Jun 12 11:02 config-5.7.0-daniel1
drwx------ 4 root root 4096 Dec 31 1969 efi
drwxr-xr-x 5 root root 4096 Jun 12 13:38 grub
-rw-r--r-- 1 root root 82644751 Mar 21 11:57 initrd.img-5.5.10-surface
-rw-r--r-- 1 root root 82672122 Apr 1 15:33 initrd.img-5.5.13-surface
-rw-r--r-- 1 root root 82116124 Mar 15 14:39 initrd.img-5.5.8-surface
-rw-r--r-- 1 root root 83634384 May 25 17:31 initrd.img-5.6.14-surface
-rw-r--r-- 1 root root 83767267 Jun 12 13:36 initrd.img-5.6.15-surface
-rw-r--r-- 1 root root 83566986 Apr 20 20:25 initrd.img-5.6.5-surface
-rw-r--r-- 1 root root 83631541 May 25 17:29 initrd.img-5.6.7-surface
-rw-r--r-- 1 root root 84109809 Jun 12 13:38 initrd.img-5.7.0-daniel1
-rw-r--r-- 1 root root 182704 Feb 13 18:09 memtest86+.bin
-rw-r--r-- 1 root root 184380 Feb 13 18:09 memtest86+.elf
-rw-r--r-- 1 root root 184884 Feb 13 18:09 memtest86+_multiboot.bin
-rw-r--r-- 1 root root 5346756 Jun 1 05:39 System.map-5.6.15-surface
-rw-r--r-- 1 root root 5403608 Jun 12 11:02 System.map-5.7.0-daniel1
-rw-r--r-- 1 root root 11888888 Jun 1 05:39 vmlinuz-5.6.15-surface
-rw-r--r-- 1 root root 11856032 Jun 12 11:02 vmlinuz-5.7.0-daniel1
Over the course of the past few months, the problem has grown to affect me more seriously. I now have 11 stale initrd.img files in /boot and there is not a single byte of disk space left. This caused APT to fail with the following error today:
Processing triggers for initramfs-tools (0.136ubuntu6.2) ...
update-initramfs: Generating /boot/initrd.img-5.7.6-surface
Error 24 : Write error : cannot write compressed block
E: mkinitramfs failure cpio 141 lz4 -9 -l 24
update-initramfs: failed for /boot/initrd.img-5.7.6-surface with 1.
dpkg: error processing package initramfs-tools (--configure):
installed initramfs-tools package post-installation script subprocess returned error exit status 1
Processing triggers for libc-bin (2.31-0ubuntu9) ...
Errors were encountered while processing:
linux-firmware
initramfs-tools
E: Sub-process /usr/bin/dpkg returned an error code (1)
While I am aware of this bug and that I need to apply the workaround of deleting those files, others might not be. They might be confused and wonder why initramfs-tools suddenly started failing.
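The manual workaround mentioned above starts with finding the leftovers. A minimal sketch, assuming a stale image is any initrd.img-<version> without a matching vmlinuz-<version> in the same directory; the function name list_stale_initrds is hypothetical. It only prints candidates and deletes nothing.

```shell
# list_stale_initrds [DIR]
# Prints every initrd.img-<version> in DIR (default: /boot) that has no
# matching vmlinuz-<version> next to it. Nothing is deleted.
list_stale_initrds() {
    dir="${1:-/boot}"
    for initrd in "$dir"/initrd.img-*; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -e "$initrd" ] || continue
        # Strip everything up to and including "initrd.img-" to get the version.
        version="${initrd##*/initrd.img-}"
        if [ ! -e "$dir/vmlinuz-$version" ]; then
            printf '%s\n' "$initrd"
        fi
    done
}
```

After reviewing the output of list_stale_initrds /boot, each listed image could be removed by hand. Running update-initramfs -d -k <version> as root may be the cleaner route, since it goes through initramfs-tools instead of bypassing it.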
I'll try to have a go at it this weekend.
@danielzgtg Okay, so I'm trying to figure out how Debian/Ubuntu implement the kernel metapackages. Specifically how they remove old kernels (they do that, right?). If they do that, any idea how?
Old kernels are removed by being "no longer required." They are removed with apt autoremove. That forms part of the apt update && apt upgrade && apt dist-upgrade && apt autoremove --purge sequence I run.
The latest kernel is always depended upon by the metapackage. The second latest kernel is not depended on, but somehow isn't marked "no longer required." Earlier kernels are always marked as "no longer required." The command that updates the metapackage does not remove the old kernel; that is done manually in a separate command.
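To check which kernels are currently marked "no longer required" without removing anything, the simulate mode of apt-get can be filtered. This is a sketch under the assumption that simulated removals are printed as lines starting with Remv followed by the package name; the function name kernels_pending_autoremove is hypothetical.

```shell
# kernels_pending_autoremove
# Reads "apt-get --simulate autoremove" output on stdin and prints the
# kernel image/header packages that would be removed.
# Assumption: removals appear as lines of the form "Remv <package> [<version>]".
kernels_pending_autoremove() {
    awk '$1 == "Remv" && $2 ~ /^linux-(image|headers)-/ { print $2 }'
}
```

A possible invocation is apt-get --simulate autoremove | kernels_pending_autoremove, which makes no changes to the system.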
Thanks! That does clear things up for me. I just wanted to be certain that there isn't any "magic" behavior just for kernel packages that removes old ones on apt upgrade automatically. So it should be enough to just specify a dependency.
@danielzgtg Can you test if upgrading to the latest version works without any problems?
Edit: Just noticed I screwed up the LTS release. Can you try updating the non-LTS release? That should work. Edit 2: The LTS release should now also work.
@qzed, thanks for looking into this and making meta-packages!
Today I wanted to install the newly available packages, but I get the following errors. I don't know if this is because of this metapackage thing...
E: /tmp/apt-dpkg-install-DbU0wE/0-linux-headers-4.19.131-surface-lts_4.19.131-surface-lts-3_amd64.deb: trying to overwrite '/usr/src/linux-headers-4.19.131-surface-lts/.config', which is also in package linux-headers-surface-lts 4.19.131-1
E: /tmp/apt-dpkg-install-DbU0wE/1-linux-headers-5.7.7-surface_5.7.7-surface-2_amd64.deb: trying to overwrite '/lib/modules/5.7.7-surface/build', which is also in package linux-headers-surface 5.7.7-1
E: /tmp/apt-dpkg-install-DbU0wE/4-linux-image-4.19.131-surface-lts_4.19.131-surface-lts-3_amd64.deb: trying to overwrite '/boot/System.map-4.19.131-surface-lts', which is also in package linux-image-surface-lts 4.19.131-1
E: /tmp/apt-dpkg-install-DbU0wE/5-linux-image-5.7.7-surface_5.7.7-surface-2_amd64.deb: trying to overwrite '/boot/System.map-5.7.7-surface', which is also in package linux-image-surface 5.7.7-1
I have both linux-surface and linux-surface-lts installed...
Yeah, that's caused by the change, thanks for the feedback! This means that to upgrade the packages, you'll have to manually uninstall linux-surface and linux-surface-lts and then re-install them.
Basically, what happens is (as far as I can see): The linux-surface-lts package gets marked for an update. With the change, it now depends on linux-image-<...>, so it wants to install that first. Installing that fails, however, because the old linux-surface-lts package still owns files that in the newer version belong to the linux-image-<...> package, and that package would then overwrite them. The files are still there because the package gets upgraded only after its dependencies are installed.
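The conflict described above can be read directly off the dpkg error lines. A minimal sketch that pulls the owning package and the contested path out of each "trying to overwrite" message; the function name overwrite_conflicts is hypothetical and the pattern is based only on the error lines quoted in this thread.

```shell
# overwrite_conflicts
# Reads dpkg/apt error output on stdin and prints "<owning-package> <path>"
# for every "trying to overwrite" line, ignoring everything else.
overwrite_conflicts() {
    sed -n "s/.*trying to overwrite '\([^']*\)', which is also in package \([^ ]*\) .*/\2 \1/p"
}
```

Fed the fourth error line above, this prints linux-image-surface /boot/System.map-5.7.7-surface, i.e. the old package that still owns the file. Running dpkg -S on the path should confirm the owner on a live system.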
I managed to recover from this error with:
sudo apt-get --fix-broken install
Old kernels are removed by being "no longer required." They are removed with apt autoremove. That forms part of the apt update && apt upgrade && apt dist-upgrade && apt autoremove --purge I do.
In Ubuntu, running autoremove or purging is not necessary as of the last four or more months. When the software updater window appears it prompts to remove old kernels. I haven't had to run either of those commands, for the purpose of removing old Ubuntu kernels, for months.
Going forward, does this mean you have to uninstall the current linux-surface kernel before you can install the latest version or it'll error? Or it's a one-off as a result of the changes listed in #96?
Am I pointing out something relevant here or am I pointing out the obvious! Do say!
In Ubuntu, running autoremove or purging is not necessary as of the last four or more months.
Interesting, thanks for sharing! Do you have any idea how the kernel packages are identified and marked for removal? Can you keep an eye on this over the next couple of linux-surface kernel releases to check whether that's working correctly and how it behaves?
Going forward, does this mean you have to uninstall the current linux-surface kernel before you can install the latest version or it'll error? Or it's a one-off as a result of the changes listed in #96?
This should be a one-off for this upgrade only, although I haven't been able to test that, so would be neat to have some feedback on that over the next couple of releases. I've also updated the announcement to make this a bit clearer.
Can you test if upgrading to the latest version works without any problems?
Boots fine. initrd.img-5.7.6-surface is still left in /boot/, but it should be the last one that needs to be removed manually. Once 5.7.8/5.8 comes out, the removal of initrd.img-5.7.7-surface can be tested.
I also encountered the thing with:
I managed to recover from this error with [...]
When the software updater window appears it prompts to remove old kernels
I don't use the GUI :)
one-off for this upgrade only
This could be avoided even for the first time with a Conflicts line.
Do you have any idea how the kernel packages are identified and marked for removal?
I'm not sure how I would find that out. I spent some time googling related terms but couldn't find anything useful. Here is the dpkg log [1] from when the kernels were being removed, the journalctl log [2] from when the preceding update was in progress (installing new kernels via the Software Updater), and a log [3] of the Software Updater removing kernels. If any other files mentioned in these links look like they might help, tell me and I'll have a look at posting them.
[1] https://github.com/condemnedmeat/dot/blob/master/dpkg.txt
[2] https://github.com/condemnedmeat/dot/blob/master/loga.txt
[3] https://github.com/condemnedmeat/dot/blob/master/logb.txt
Or perhaps the answer is somewhere here...
/etc/kernel/postinst.d: apt-auto-removal initramfs-tools unattended-upgrades update-notifier zz-update-grub
/etc/kernel/postrm.d: initramfs-tools zz-update-grub
The update went well
Neat!
This could be avoided even for the first time with a Conflicts line.
Ah thanks! I'll keep that in mind for the next time. On the other hand, the failure gets users to (hopefully) look at the announcements and makes them aware of the outdated initrd files, so I think it might not be that bad of a thing.
I also encountered the thing with:
I managed to recover from this error with [...]
Thanks! I've updated the announcement and added the apt-get --fix-broken install command.
Or perhaps the answer is somewhere here...
/etc/kernel/postinst.d: apt-auto-removal
I think that could be it. In that case it should already work (since the initramfs-tools stuff works). Guess the only way to know for sure is to check the behavior over the next couple of releases.
I got these errors on the very next update.
It's recreated the initrd files that had been deleted.
It somewhat looks like there are still remnants of old kernels around. As I don't use Ubuntu, I have no clue where, or what would cause the initrds to be rebuilt.
The solution may be to sudo rm the equivalent files in /var/lib/initramfs-tools, as apparently that is where initramfs-tools looks. I'll see what happens when I next update.
I'm not sure if you only need to remove those files or if you must remove the files in the boot partition too.
The source of my information was:
http://www.bigsoft.co.uk/blog/2018/05/13/ubuntu-boot-partition-full
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1773476/comments/6
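Before deleting anything under /var/lib/initramfs-tools, it may help to list which entries there no longer correspond to an installed kernel. This is a sketch under the assumption, taken from the links above, that the directory holds one file per kernel version and that each file is named after that version; the function name list_orphaned_initramfs_state is hypothetical.

```shell
# list_orphaned_initramfs_state [STATE_DIR] [BOOT_DIR]
# Prints every version recorded in STATE_DIR (default: /var/lib/initramfs-tools)
# for which BOOT_DIR (default: /boot) has no vmlinuz-<version>.
# Assumption: STATE_DIR contains one regular file per kernel version.
list_orphaned_initramfs_state() {
    state_dir="${1:-/var/lib/initramfs-tools}"
    boot_dir="${2:-/boot}"
    for entry in "$state_dir"/*; do
        # Skip the unexpanded glob and anything that is not a regular file.
        [ -f "$entry" ] || continue
        version="${entry##*/}"
        [ -e "$boot_dir/vmlinuz-$version" ] || printf '%s\n' "$version"
    done
}
```

Any version it prints is a candidate for why an initrd gets rebuilt after the image in /boot was deleted by hand. The function only reports; it does not remove the state files.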
No depmod issues and the previous kernel is automatically removed during the update so all works on 18.04 & 20.04.
Neat! Closing this as it seems to be resolved. Feel free to comment/re-open if there are still issues remaining.
It worked today and automatically removed the third last kernel. It also magically preserved the second last kernel.
Currently, old kernels are not removed at the time the new kernel is installed via the Software Updater. Instead, they are listed for removal the next time the Updater is used. Perhaps I was wrong to report it as working in July, or this is intended behaviour. This has been the case since debian_lts-4.19.134 or so.
I guess as long as apt autoremove removes them (except for maybe the latest N versions), it works as intended. Let me know if apt autoremove doesn't work.
I should think that works as it is meant to.
When updating the Debian package (both 4.19-lts and 5.6), the file initrd.img-xxx from the old package stays behind. I just found 5 of those outdated files. Is there a chance to include the removal in the Debian packages?