clearlinux / distribution

Placeholder repository to allow filing of general bugs/issues/etc against the Clear Linux OS for Intel Architecture linux distribution

"File that should be deleted: /root" when updating to 41340 ? #3083

Open lars-nvhgroup opened 5 months ago

lars-nvhgroup commented 5 months ago

When updating from 41300 to 41340, swupd wants to delete the /root folder, why? -> File that should be deleted: /root -> not deleted (not empty)

This may be what happened to one of my servers: sometime during the last 24h all of the files in /root/ got deleted, so it became unreachable by ssh (/root/.ssh/ got deleted).

fenrus75 commented 5 months ago

we're investigating and, as a precaution, are stopping the 340 rollout


fenrus75 commented 5 months ago

we're pushing a new release out now (after having stopped the 340 rollout) with this fixed

next is to go through root cause analysis and find all the places that should have prevented this and didn't -- and turn that into action/changes in those places

On Wed, Mar 27, 2024 at 9:51 AM Mario Roy wrote:

I tried to understand why the contents of the /root folder were wiped away. The /root entry exists in filesystem-3.0.14-233.x86_64.rpm. It was removed in the 235 release and put back in the 237 release.

41300:
$ rpm -qlp filesystem-3.0.14-233.x86_64.rpm | grep ^/root
/root

41340:
$ rpm -qlp filesystem-3.0.14-235.x86_64.rpm | grep ^/root

41350:
$ rpm -qlp filesystem-3.0.14-237.x86_64.rpm | grep ^/root
/root

I restored /root/.vimrc
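For anyone who wants to repeat this comparison across releases, here is a minimal sketch. It assumes the two file lists have already been captured as plain lists of paths (for example from the `rpm -qlp` output). The function name and the set of protected directories are hypothetical, chosen for illustration only; they are not part of swupd or the Clear Linux build tooling.

```python
# Hypothetical helper: names and the protected list are illustrative
# assumptions, not taken from swupd or the Clear Linux tooling.
PROTECTED_DIRS = {"/root", "/home", "/etc", "/var"}


def removed_protected_paths(old_files, new_files, protected=PROTECTED_DIRS):
    """Return protected paths present in old_files but missing from new_files."""
    removed = set(old_files) - set(new_files)
    return sorted(removed & set(protected))


# Illustrative file lists, not the real package contents.
old_list = ["/root", "/usr", "/usr/bin"]  # stands in for release 233
new_list = ["/usr", "/usr/bin"]           # stands in for release 235

print(removed_protected_paths(old_list, new_list))  # prints ['/root']
```

An empty result means no protected directory disappeared between the two lists.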


neben commented 5 months ago

My /root dir content was also deleted. @lars-whta where did you see that message? I only have the following for this update:

Mar 27 05:05:23 nuc systemd[1]: Starting swupd-update.service...
Mar 27 05:05:23 nuc swupd[2525375]: Update started
Mar 27 05:05:25 nuc swupd[2525375]: [49B blob data]
Mar 27 05:05:29 nuc swupd[2525375]: [189B blob data]
Mar 27 05:05:29 nuc swupd[2525375]:  - Sphinx
Mar 27 05:05:29 nuc swupd[2525375]:  - binutils
Mar 27 05:05:29 nuc swupd[2525375]:  - c-basic
Mar 27 05:05:29 nuc swupd[2525375]:  - cloud-api
Mar 27 05:05:29 nuc swupd[2525375]:  - desktop-gnomelibs
Mar 27 05:05:29 nuc swupd[2525375]:  - dnf
Mar 27 05:05:29 nuc swupd[2525375]:  - file-roller
Mar 27 05:05:29 nuc swupd[2525375]:  - gedit
Mar 27 05:05:29 nuc swupd[2525375]:  - gimp
Mar 27 05:05:29 nuc swupd[2525375]:  - gjs
Mar 27 05:05:29 nuc swupd[2525375]:  - gnupg
Mar 27 05:05:29 nuc swupd[2525375]:  - gstreamer
Mar 27 05:05:29 nuc swupd[2525375]:  - gvim
Mar 27 05:05:29 nuc swupd[2525375]:  - hardware-printing
Mar 27 05:05:29 nuc swupd[2525375]:  - harfbuzz-lib
Mar 27 05:05:29 nuc swupd[2525375]:  - kde-frameworks5
Mar 27 05:05:29 nuc swupd[2525375]:  - lib-opengl
Mar 27 05:05:29 nuc swupd[2525375]:  - libX11client
Mar 27 05:05:29 nuc swupd[2525375]:  - os-core
Mar 27 05:05:29 nuc swupd[2525375]:  - os-core-plus
Mar 27 05:05:29 nuc swupd[2525375]:  - os-core-update
Mar 27 05:05:29 nuc swupd[2525375]:  - package-utils
Mar 27 05:05:29 nuc swupd[2525375]:  - perl-basic
Mar 27 05:05:29 nuc swupd[2525375]:  - polkit
Mar 27 05:05:29 nuc swupd[2525375]:  - pulseaudio
Mar 27 05:05:29 nuc swupd[2525375]:  - python-extras
Mar 27 05:05:29 nuc swupd[2525375]:  - sysadmin-basic
Mar 27 05:05:29 nuc swupd[2525375]:  - vim
Mar 27 05:05:29 nuc swupd[2525375]:  - webkitgtk
Mar 27 05:05:29 nuc swupd[2525375]:  - x11-server
Mar 27 05:05:40 nuc swupd[2525375]: Finishing packs extraction...
Mar 27 05:06:01 nuc swupd[2525375]: Statistics for going from version 41300 to version 41340:
Mar 27 05:06:01 nuc swupd[2525375]:     changed bundles   : 30
Mar 27 05:06:01 nuc swupd[2525375]:     new bundles       : 0
Mar 27 05:06:01 nuc swupd[2525375]:     deleted bundles   : 0
Mar 27 05:06:01 nuc swupd[2525375]:     changed files     : 1385
Mar 27 05:06:01 nuc swupd[2525375]:     new files         : 4292
Mar 27 05:06:01 nuc swupd[2525375]:     deleted files     : 3468
Mar 27 05:06:01 nuc swupd[2525375]: Validate downloaded files
Mar 27 05:06:05 nuc swupd[2525375]: No extra files need to be downloaded
Mar 27 05:06:05 nuc swupd[2525375]: Installing files...
Mar 27 05:06:07 nuc swupd[2525375]: Update was applied
Mar 27 05:06:07 nuc swupd[2525375]: Calling post-update helper scripts
Mar 27 05:06:07 nuc swupd[2525375]: External command: none
Mar 27 05:06:07 nuc swupd[2525375]: External command: pacdiscovery.service: restarted (the binary was updated)
Mar 27 05:06:07 nuc swupd[2525375]: External command: tallow.service: restarted (the binary was updated)
Mar 27 05:06:07 nuc swupd[2525375]: External command: pacrunner.service: restarted (the binary was updated)
Mar 27 05:06:07 nuc swupd[2525375]: External command: systemd-journald.service: restarted (the binary was updated)
Mar 27 05:06:08 nuc swupd[2525375]: External command: systemd-resolved.service: restarted (the binary was updated)
Mar 27 05:06:08 nuc swupd[2525375]: External command: systemd-timesyncd.service: restarted (the binary was updated)
Mar 27 05:06:57 nuc swupd[2525375]: Update took 94.0 seconds, 271 MB transferred
Mar 27 05:06:57 nuc swupd[2525375]: Update successful - System updated from version 41300 to version 41340
Mar 27 05:06:57 nuc systemd[1]: swupd-update.service: Deactivated successfully.
Mar 27 05:06:57 nuc systemd[1]: Finished swupd-update.service.

victorstewart commented 5 months ago

can't stress enough how important it is for this to never happen again

bwarden commented 5 months ago

We've taken actions to block this release and to prevent a recurrence, while we investigate deeper how the problem made it through various quality checks.

We're deeply sorry for the issues this caused, and are actively bolstering code paths and test procedures at multiple levels to prevent this or similar issues from happening again.

Background: I made a change a few weeks ago to remove an extraneous entry for /root from the configuration used by systemd-tmpfiles to create and maintain various directories on the platform. It should have been harmless, but instead exposed some weak points in our development and release processes that culminated in the /root directory being deleted from updated systems.

We are putting the following measures and failsafes in place to ensure this and a multitude of similar issues are prevented in the future:

  • We've already automated tests to catch this type of failure very early in our development process (creating the base files), prior to generating content for even an update candidate.

  • We're currently adding more complex tests for the update content itself (manifests), to fail on an attempt to delete protected content.

  • We identified code in swupd itself that was supposed to prevent touching files outside of /usr, but wasn’t quite correct, and have fixed it and added more specific protections.

  • We're reworking how swupd update works with files it doesn't know about. Specifically it will no longer remove unknown files or directories. Only swupd repair --extra-files-only will remove files that swupd doesn't know about (with documented caveats). As always, the source code for swupd-client is available for your review.

  • And finally, we're improving our ability to block a release that’s already in the wild if we find out there's a serious problem that slipped through.

lars-nvhgroup commented 5 months ago

Thanks for the rundown! Good it got caught early, but it did cause me 15 min of headscratching 🙃
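To make the manifest check and the /usr guard from the list above more concrete, here is a rough sketch of what such a test could look like. It is not swupd's actual logic. The entry format (a path plus a "deleted" flag) is a simplified stand-in for a real manifest, and the function names are hypothetical.

```python
import posixpath


def is_under(path, prefix="/usr"):
    """True if path, after normalization, is prefix itself or lies below it."""
    norm = posixpath.normpath(path)
    return norm == prefix or norm.startswith(prefix + "/")


def unsafe_deletions(entries, allowed_prefix="/usr"):
    """Return paths marked for deletion that fall outside allowed_prefix.

    entries is an iterable of (path, deleted) pairs. This is an assumed,
    simplified shape, not the real swupd manifest format.
    """
    return [path for path, deleted in entries
            if deleted and not is_under(path, allowed_prefix)]


# Illustrative entries only.
entries = [
    ("/root", True),               # deletion outside /usr: flagged
    ("/usr/share/doc/old", True),  # deletion inside /usr: allowed
    ("/usr/../root/.ssh", True),   # escapes /usr once normalized: flagged
    ("/etc/hosts", False),         # not a deletion: ignored
]

print(unsafe_deletions(entries))  # prints ['/root', '/usr/../root/.ssh']
```

Normalizing before the prefix comparison is the point of the sketch: a plain string prefix test would accept `/usr/../root/.ssh`.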

marioroy commented 5 months ago

Will the safety measures outlined above safeguard the filesystem package? For example, will QA catch the /root entry being removed from the file list section of the spec file?

Thanks for making Clear Linux reliable.

fenrus75 commented 5 months ago

each of these steps will catch this separately.... (yes it is a bit redundant but so be it)
