sodaliterocks / sodalite

🪨 A Pantheon experience for rpm-ostree
https://sodalite.rocks
MIT License

Infinite loop related to migrate.sh script #88

Closed RedBearAK closed 1 year ago

RedBearAK commented 1 year ago

@electricduck

I have a ton of this in the journal, constantly repeating:

Jul 25 04:03:10 localhost.localdomain rocks.sodalite.user-daemon.desktop[36076]: cat: /var/run/rocks.sodalite.hacks/migrate-status: No such file or directory
Jul 25 04:03:11 localhost.localdomain rocks.sodalite.user-daemon.desktop[36082]: cat: /var/run/rocks.sodalite.hacks/migrate-status: No such file or directory
Jul 25 04:03:12 localhost.localdomain rocks.sodalite.user-daemon.desktop[36088]: cat: /var/run/rocks.sodalite.hacks/migrate-status: No such file or directory
Jul 25 04:03:13 localhost.localdomain rocks.sodalite.user-daemon.desktop[36090]: cat: /var/run/rocks.sodalite.hacks/migrate-status: No such file or directory

The fold, tr and cat /dev/urandom processes started from the sodalite-migrate.service command line appear to be stuck in a loop, using around 30% CPU between them. They are all invoked from the service file via rocks.sodalite.hacks migrate --all.

I followed the quick instructions to rebase a ublue F38 install to Sodalite and it all seemed to go fine. I ended up in the Pantheon desktop, as expected. But this "migrate --all" thing has been pegging at least one core on a 4c8t Ryzen 3700u for hours while I investigate what it's supposed to be doing.

I don't have any idea what this is all about, but something apparently needs to handle this missing file more gracefully.

Stopping the sodalite-migrate.service doesn't seem to stop the messages in the log. It also doesn't terminate the offending processes automatically. Mostly it's fold using 12.5% (one thread) of the CPU forever.

Touching the missing file stops the messages in the journal, but won't stop fold/tr/cat from using CPU, so I have to manually kill fold.

electricduck commented 1 year ago

Hm, interesting. Not seeing this myself. If you're wondering what this even is, it's there to correct a few things that may cause issues for Sodalite post-install. The error comes from the fact that the pidfile isn't being created for some reason, but it's still trying to cat it (and ending up in a loop somehow?). Seems easy enough to sort out, since yes, it does not handle this gracefully.
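A minimal sketch of what handling the missing file more gracefully could look like. This is not Sodalite's actual code: the function name, retry count and interval are assumptions made for illustration, and only the file path is taken from the journal output above. The idea is to check that the file is readable before calling cat, cap the number of retries, and log a single meaningful error instead of one per second.

```bash
#!/usr/bin/env bash

# Hypothetical helper, not taken from the Sodalite sources.
# Usage: read_migrate_status [status_file] [max_attempts] [interval_seconds]
read_migrate_status() {
    # Default path is the one reported in the journal; the rest are assumed defaults.
    local status_file="${1:-/var/run/rocks.sodalite.hacks/migrate-status}"
    local max_attempts="${2:-30}"
    local interval="${3:-1}"
    local attempt=0

    while (( attempt < max_attempts )); do
        # Only read the file once it exists and is readable.
        if [[ -r "$status_file" ]]; then
            cat "$status_file"
            return 0
        fi
        attempt=$(( attempt + 1 ))
        sleep "$interval"
    done

    # One clear message on stderr, then stop polling.
    echo "migrate: status file '$status_file' not found after $max_attempts attempts, giving up" >&2
    return 1
}
```

A caller could then branch on the return code, for example `status="$(read_migrate_status)" || exit 1`, rather than polling forever.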

As for why it didn't create the pidfile in the first place, that's unusual. If you still have Sodalite installed, could you confirm if:

RedBearAK commented 1 year ago

I'll have to get back to you on that. I restored an earlier snapshot of the VM because I had some other testing to do, so it's back to being rebased on Silverblue-main, but I think I can restore the snapshot where it was rebased on Sodalite and this was happening, and check some things.

But I think the primary thing is just to keep the CPU core-killer infinite loop from happening, and have a meaningful error message coming out in the journal. Spent two hours wondering what "fold" thought it was doing that needed a whole core. (Just trying to create a randomized host name, apparently, according to GPT-4.)

RedBearAK commented 1 year ago

@electricduck

Reverted to the VM snapshot from right after following the instructions to set up Sodalite. It is definitely Sodalite, with the Pantheon desktop.

Directory /var/lib/sodalite exists. Appears to be empty, including dot files.

Flatpak remotes:

[testuser@localhost ~]$ flatpak remotes
Name    Options
flathub system
flathub user
[testuser@localhost ~]$ 

Not having any real idea how all this immutable stuff actually works, all I can tell you is that I followed all the "Quickstart" instructions and rebooted, saw an elementary OS style login screen, and logged into a Pantheon desktop that looks pretty much like elementary OS. Other than this glitch and /var/lib/sodalite being empty, everything seems to work fine. I would not have known there was anything wrong with the install.

Trying to run the Quickstart instructions again:

[testuser@localhost ~]$ sudo ostree remote add --if-not-exists sodalite https://ostree.sodalite.rocks --no-gpg-verify
[sudo] password for testuser:
[testuser@localhost ~]$ sudo ostree pull sodalite:sodalite/current/x86_64/desktop
2 metadata, 0 content objects fetched; 0 bytes content written
[testuser@localhost ~]$ sudo rpm-ostree rebase sodalite:sodalite/current/x86_64/desktop
error: Old and new refs are equal: sodalite:sodalite/current/x86_64/desktop
[testuser@localhost ~]$ 

Seems to indicate that it doesn't think anything new should happen.

Oh, but when I kill fold, a "Migrating" progress dialog appears on screen.

RedBearAK commented 1 year ago

@electricduck

Seems like it may be doing what was originally intended. It appears to be installing Flatpaks (taking forever, probably because it is downloading the platform/SDK/frameworks needed to support the apps). The gala process is taking around 65% CPU, with the zenity progress dialog taking a few percent. I had installed btop (with --apply-live) just to see what was going on, so I'm seeing the activity in the terminal window behind the zenity dialog.

It's on io.elementary.calendar at this point. I assume this is the stuff that's supposed to happen if there had been no infinite loop glitch.

RedBearAK commented 1 year ago

@electricduck

OK, it seems to be done, and these are the contents of the unattended apps file now:

+:pantheon:appcenter:io.elementary.Platform:7.1
+:pantheon:appcenter:org.gnome.FileRoller:stable
+:pantheon:appcenter:io.elementary.calculator:stable
+:pantheon:appcenter:io.elementary.calendar:stable
+:pantheon:appcenter:io.elementary.camera:stable
+:pantheon:appcenter:io.elementary.capnet-assist:stable
+:pantheon:appcenter:io.elementary.screenshot:stable
+:pantheon:appcenter:io.elementary.tasks:stable
+:pantheon:appcenter:io.elementary.videos:stable
+:pantheon:flathub:org.freedesktop.Platform:22.08

And rebooting...

Gala is very busy doing something for about 30 seconds after logging in... Other than that, things seem fine now.

Somehow, at some point, the random hostname thing seems to have succeeded:

[testuser@sodalite-g0dEDs ~]$ 

That's about all I can report.

electricduck commented 1 year ago

Eventually got to the bottom of this after doing a clean install on my own laptop. Seems like it kept getting caught on the hostname generation and just eating up memory. I have replaced this with a new method that uses Bash's $RANDOM variable instead of reading from /dev/urandom.
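For illustration, a sketch of hostname generation based on $RANDOM. This is not the actual fix that was committed: the function name and character set are assumptions, and the "sodalite-" prefix with a six-character suffix is inferred from the sodalite-g0dEDs hostname seen earlier in this thread. Because $RANDOM is a shell built-in, no long-running cat, tr or fold processes are needed.

```bash
#!/usr/bin/env bash

# Hypothetical function: build a random alphanumeric string using only
# Bash's $RANDOM, without spawning any external processes.
generate_hostname_suffix() {
    local length="${1:-6}"
    local chars='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    local suffix=''
    local i idx

    for (( i = 0; i < length; i++ )); do
        # $RANDOM yields 0..32767; the modulo picks an index into chars.
        idx=$(( RANDOM % ${#chars} ))
        suffix+="${chars:idx:1}"
    done

    printf '%s\n' "$suffix"
}

# Assumed naming scheme, based on the hostname observed in the thread.
echo "sodalite-$(generate_hostname_suffix 6)"
```

$RANDOM is not cryptographically secure, which should not matter for a hostname suffix. The loop always terminates after a fixed number of iterations.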