Closed jedahan closed 3 years ago
I cannot reproduce this issue at all (with any libraries sway links to nor with other packages in a similar scenario).
Now, I had a similar issue a long while back (updating mesa would kill sway) and fixed it
by enabling CONFIG_CHECKPOINT_RESTORE=y
in my kernel. Not sure if this is related
to this present issue, just thought I'd mention it. (NOTE: this is unique to amdgpu).
Whatever the fix ends up being for this issue - it must remain portable and if at all
possible POSIX compliant. I'm not opposed to using install
, I'd just like to do some
research and see what our options are as to other possible remedies.
Let me know if this fixes the issue, if not I'll try other approaches I have in mind.
Somewhat relevant: In my blog post I mentioned replacing components of the
package manager with C (in an opt-in, unintrusive manner of course). One of the
first items on my list is a pkg_install_files
equivalent in C as installation is one of
the slowest parts of the package manager (for medium to large packages).
Small change on my own end with the latest master. Instead of my computer just completely locking up and requiring a forced restart, wayfire
merely segfaults when it or its dependencies are installed.
So... Progress...
Interesting. I still wonder why I can't reproduce this at all.
Will work on this some more. Thanks for testing.
Trying to come up with a minimal testcase is hard. Maybe writing a program that calls a library function once every 5 seconds, then changing the name of the function and replacing the library while the program is running? I wonder if the install
package (busybox or coreutils) has a test like this.
I've been playing around with using install
and it's still not perfect as we have to fallback to cp -f
as install
aggressively resolves symbolic links. Here's a patch you can try.
diff --git a/kiss b/kiss
index 11c6b83..7bd4f23 100755
--- a/kiss
+++ b/kiss
@@ -1085,27 +1085,11 @@ pkg_install_files() {
# Skip directories as they're likely symlinks in this case.
# Pure directories in manifests have a suffix of '/'.
[ -d "$_file" ] || test "$1" "$_file" || {
- _file_tmp=$2/${file#/}
-
- # Copy the file to a temporary location so that later
- # verification (after pkg_remove_files()) is possible.
- #
- # This duplication only occurs during the first
- # invocation of pkg_install_files() - the second call
- # will simply do the 'mv' call.
- case $1 in -z)
- cp -fP "$2/${file#/}" "$2/__f-$pid"
- _file_tmp=$2/__f-$pid
- esac
-
- # Atomically move the temporary file to the root
- # filesystem. The running processes will either get
- # the old file or the new one.
- mv -f "$_file_tmp" "$_file"
-
- # Skip changing permissions of symlinks. This prevents
- # errors when the symlink exists prior to the target.
- [ -h "$_file" ] || chmod "$b$oct" "$_file"
+ if [ -h "$2$file" ]; then
+ cp -fP "$2$file" "$_file"
+ else
+ install -m "$b$oct" "$2$file" "$_file"
+ fi
}
esac
done || :
Please try this patch. The original fix is wrong as the atomic operation can't occur across filesystem boundaries. This instead copies to the destination directory and moves it directly from there.
diff --git a/kiss b/kiss
index 11c6b83..394f613 100755
--- a/kiss
+++ b/kiss
@@ -1085,7 +1085,8 @@ pkg_install_files() {
# Skip directories as they're likely symlinks in this case.
# Pure directories in manifests have a suffix of '/'.
[ -d "$_file" ] || test "$1" "$_file" || {
- _file_tmp=$2/${file#/}
+ _file_tmp=$2$file
+ _parent_dest=${_file%/*}
# Copy the file to a temporary location so that later
# verification (after pkg_remove_files()) is possible.
@@ -1094,8 +1095,8 @@ pkg_install_files() {
# invocation of pkg_install_files() - the second call
# will simply do the 'mv' call.
case $1 in -z)
- cp -fP "$2/${file#/}" "$2/__f-$pid"
- _file_tmp=$2/__f-$pid
+ _file_tmp=$_parent_dest/__tmp_${file##*/}-$pid
+ cp -fP "$2$file" "$_file_tmp"
esac
# Atomically move the temporary file to the root
NOTE: Regardless of whatever the final fix is, this entire process needs to be made more robust.
Both solutions worked for me, tested by upgrading sway and it’s dependencies while running. Thank you!
Pushed the fix. Please try and it let me know. :)
If we want to fix the error case (dirty filesystem). What we want is something transactional. This commit just adds atomicity to each file transfer. To make this transactional we'd be pushing the limits of the shell further (and slowing down install even more in the process).
Transactional install would require making copies of all the previous version's files in $KISS_ROOT
(this can happen before or during each transfer), doing the install and then if there are errors, copying all the previous version's files back to the filesystem (while also deleting any files installed from the new version).
This is better suited for the C implementation of the component although it should be trivial to print out the state of the filesystem alongside the error message for the shell implementation.
I was playing around with writing a similar idea out
pre-merge: tar up backup manifest (+ checksums?) merge: as is trap err in merge: untar backup manifest (+ report checksums)
Can I get confirmation that master still works? I had to fix #232 and this may have regressed the fix for this issue.
Latest git master appears to have resolved this problem!
Tested latest master by removing many firefox dependencies while writing this comment. Seems solid.
As an example, if I
kiss build cairo && kiss install cairo
, sway will hard-lock.I was able to replace
cp -fP
with (busybox)install
inpkg_install_files
on https://github.com/kisslinux/kiss/blob/master/kiss#L1088 which fixed the issue.Not sure if there is a way to fix not using install, but was nice to be able to
kiss update
without having to hard reboot my laptop.