rocky-linux / rocky-tools

Prevent accidental Rocky to Rocky migration #57

Open bcomisky opened 3 years ago

bcomisky commented 3 years ago

I had an error running migrate2rocky due to an RPM conflict that I needed to resolve before distro-sync would run. After fixing the RPM issue I naively ran the migrate2rocky script again. But since my system was already considered Rocky at that point, the script got confused and ended up deleting some RPMs I needed before it failed.

Log from first attempt, failed distro-sync due to RPM conflict: https://pastebin.com/hg1KSbRL

Log from second attempt, with Rocky->Rocky failure: https://pastebin.com/mxsi1PVf

pajamian commented 3 years ago

The first attempt looks to me like it would have failed a dnf update if one had been attempted before the migration. I've seen a few cases where dnf update would have failed, and my conclusion is that if you can't successfully run dnf update, that indicates the distro-sync stage of the migration will also fail. So the solution here is to have the script attempt to run dnf update before it starts the migration, which gives it a chance to bail before it does anything that could leave the system in an unstable state.
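A minimal sketch of that ordering, written in Python purely for illustration (migrate2rocky itself is a shell script, and this is not the code from any PR). The names `preflight_then_migrate`, `PreflightError`, and `run_command` are hypothetical, and the exact `dnf` arguments are an assumption based on the commands named in this thread:

```python
import subprocess
from typing import Callable, Sequence


class PreflightError(RuntimeError):
    """Raised when the pre-migration update fails (hypothetical name)."""


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def preflight_then_migrate(
    migration_steps: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = run_command,
) -> None:
    """Run `dnf -y update` first; only touch the system further if it succeeds."""
    # Assumption: a failing update predicts a failing distro-sync,
    # so bail here, before any package has been swapped.
    if run(["dnf", "-y", "update"]) != 0:
        raise PreflightError("dnf update failed; resolve this before migrating")

    # The migration proper starts only after the pre-flight check passed.
    for argv in migration_steps:
        status = run(argv)
        if status != 0:
            raise RuntimeError(f"step failed with status {status}: {' '.join(argv)}")
```

Passing the runner as a parameter keeps the check testable without actually invoking `dnf`.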

The second attempt is, as you say, because you tried to run migrate2rocky a second time after the first attempt was partially successful. This can easily be solved by checking the os-release file to see if the current system is already rockylinux and bailing if it is. On a side note, a second attempt could mean that another distro-sync and the commands that follow it are being requested, so we may want some sort of option to skip the initial system package swap and just run the distro-sync and efi commands.
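The os-release check could look roughly like the following. This is an illustrative Python sketch, not the check that was eventually merged; the function names are hypothetical, and it assumes that a Rocky Linux system reports `ID="rocky"` in `/etc/os-release`:

```python
import shlex


def parse_os_release(text: str) -> dict:
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # handles both ID=rocky and ID="rocky"
        fields[key] = parts[0] if parts else ""
    return fields


def is_already_rocky(text: str) -> bool:
    """True if the os-release contents identify the system as Rocky Linux."""
    # Assumption: Rocky sets ID to "rocky". ID_LIKE is deliberately ignored,
    # because a Rocky system may also list other distributions there.
    return parse_os_release(text).get("ID") == "rocky"


def check_system(path: str = "/etc/os-release") -> bool:
    """Read the os-release file and report whether migration should bail."""
    with open(path, encoding="utf-8") as fh:
        return is_already_rocky(fh.read())
```

A script using this would call `check_system()` before doing anything else and exit with an error if it returns true.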

bcomisky commented 3 years ago

Yes, you are exactly right on all counts. I imagine I'm not the only one who will impulsively pull the trigger on migrate before doing their due diligence, so some safeguards would surely help someone else. For reference, a manual download of the missing rocky-* RPMs, followed by the distro-sync and fix_efi steps from the migrate2rocky script, fixed my installation.

pajamian commented 3 years ago

Something else I thought of: in addition to that, an option that basically does the manual install of the RPMs plus the distro-sync and fix_efi steps, something like "fix failed migration", would help those who do end up, for whatever reason, with a partial migration.

pajamian commented 3 years ago

The first attempt looks to me like it would have failed a dnf update ... So the solution here is to have the script attempt to run dnf update before it starts the migration

This is done now with PR #58

pajamian commented 3 years ago

The second attempt is, as you say, because you tried to run migrate2rocky a second time after the first attempt was partially successful. This can easily be solved by checking the os-release file to see if the current system is already rockylinux and bailing if it is.

Done with PR #64

systemcrash commented 3 years ago

Allowing the script to pick up where it left off would be good... like it drops markers in a .dot file to say which steps it completed.
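One way the marker idea could work, sketched in Python for illustration only. The function `run_steps`, the step names, and the marker file location are all hypothetical and not part of migrate2rocky:

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def run_steps(
    steps: Iterable[Tuple[str, Callable[[], None]]],
    marker_file: Path,
) -> List[str]:
    """Run named steps in order, skipping any already recorded in marker_file.

    Returns the names of the steps executed in this invocation.
    """
    # Load the names of steps that completed in an earlier run, if any.
    done = set()
    if marker_file.exists():
        done = {line for line in marker_file.read_text().splitlines() if line}

    executed = []
    for name, action in steps:
        if name in done:
            continue  # finished in a previous run; do not repeat it
        action()  # an exception here leaves this step unmarked
        # Record completion only after the step succeeded.
        with marker_file.open("a") as fh:
            fh.write(name + "\n")
        executed.append(name)
    return executed
```

Because a step is only recorded after it returns, rerunning after a failure resumes at the failed step instead of repeating the package swap.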

pajamian commented 3 years ago

See https://github.com/rocky-linux/rocky-tools/issues/69#issuecomment-876450537

lbdroid commented 2 years ago

So per https://github.com/rocky-linux/rocky-tools/issues/69 -- I've done another migration from CentOS 8 to Rocky, and experienced another failure. 2 failures in 2 attempts on different servers with different configurations is really telling me that I'm glad I only have 2 servers.

The second failure was for a different cause; I have a requirement to run janus-gateway, which I have built as an RPM as per https://github.com/NethServer/janus-gateway -- but slightly modified to bring it to version 0.11.3. The issue is that it depends on libnice-devel version >= 0.1.16 (I'm using 0.1.17), but the highest version available in Rocky repositories is 0.1.14. So the updater wanted to DOWNGRADE libnice and libnice-devel to 0.1.14, which doesn't satisfy the requirement for janus-gateway.
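The arithmetic behind that conflict, as a small Python illustration. It uses a simplified comparison of purely numeric dotted versions, which is an assumption that happens to fit these version strings; it is not RPM's full version comparison (no epochs, releases, or non-numeric segments), and the function names are hypothetical:

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a purely numeric dotted version such as '0.1.16' into (0, 1, 16)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(available: str, minimum: str) -> bool:
    """True if `available` meets a '>= minimum' requirement."""
    return parse_version(available) >= parse_version(minimum)


# janus-gateway requires libnice-devel >= 0.1.16
installed_ok = satisfies_minimum("0.1.17", "0.1.16")   # version installed before the sync
after_sync_ok = satisfies_minimum("0.1.14", "0.1.16")  # version the sync would downgrade to
```

Here `installed_ok` is true and `after_sync_ok` is false, which is why the downgrade cannot proceed while janus-gateway is installed.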

Sorry I don't have the original log file.

I rescued the migration by uninstalling janus-gateway, then running dnf -y distro-sync, re-upgrading libnice[-devel], reinstalling janus-gateway, and finally rebooting.

pajamian commented 2 years ago

Wherever you got libnice from, it wasn't CentOS: the latest version in CentOS 8 is 0.1.14. Where did you get libnice from?

lbdroid commented 2 years ago

Here: https://koji.fedoraproject.org/koji/buildinfo?buildID=1657236 But the source of it really isn't the point. The point is that this situation broke the update. It could be any package that depends on a higher version of some package than what is available from the repositories.

I wouldn't necessarily expect the migration to handle every corner case that could halt it. But the ability to resume a migration would adequately address this and other corner cases: it would give the end user the opportunity to take corrective action without having to read and understand the migration script and perform the final steps manually.