@jaawasth Hi Jai, it would be great to start a debugging session about the DMS issue with the test system you have set up. I can arrange for a meeting on Fri 16th, Mon 19th, or Thu 21st between 3-7pm CEST. If you send a Teams event via ms@suse.com it will land in my calendar.
Thanks
@jaawasth Thanks Jai for the debugging sessions. We have been able to fix all issues and will rebuild the DMS including the fixes. For customers to access the DMS packages (SLES15-Migration and suse-migration-sle15-activation) we need to run a new release to the SUSE namespace.
As this will take time I'm trying to speed up this process with maintenance.
I will do the submissions after we have clarity that everything works as expected. Therefore I will provide the packages for the submission on the jump host from which you also fetched the debugging versions.
I'll let you know when they are there and you can do a final test without us attending, agreed? On positive feedback I'll submit to SLES.
Thanks
@schaefi, yes, this sounds good.
Thanks !!
@jaawasth OK, the hopefully final versions of the updated DMS packages are available on the jump host we used for debugging. Please fetch them and give it a try:
$ rpm -e suse-migration-sle15-activation SLES15-Migration
$ rpm -Uhv suse-migration-sle15-activation-2.0.23-6.21.1.noarch.rpm SLES15-Migration-2.0.23-6.x86_64.rpm
$ reboot
Let us know if everything works. I'll keep this open until your feedback arrives.
Again, thanks so much for walking through the system with us :+1:
@schaefi, sure, thanks. I'll be able to give it a try later today and let you know how it goes.
Thanks !!
@schaefi, the migration RPMs worked fine and the system got upgraded. Is there any particular file you want from the system to ensure completeness?
Great news, thanks for the feedback. I don't think I need any further files. With that information I can now start the release of the DMS in SLES. Thanks, this is a big step forward.
@schaefi, we have one more environment where we are testing, and there are some issues there. Can you please have a look at the logs? I'm not testing it directly myself, so I'll be taking a look at the systems later in the day to find what is missing here. Attaching the log meanwhile: distro_migration.log
This environment is different from the one where we tested before. It's on multipath, but the OS is installed on physical hard drives [rather than over FC] attached to the system in RAID-1.
Oh, I think there is another error, introduced by the cert patch. Thanks for the log, it shows:
● suse-migration-prepare.service - Prepare For Migration
Loaded: loaded (/usr/lib/systemd/system/suse-migration-prepare.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2021-04-22 19:45:23 UTC; 9s ago
Process: 6137 ExecStart=/usr/bin/suse-migration-prepare (code=exited, status=1/FAILURE)
Main PID: 6137 (code=exited, status=1/FAILURE)
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: trust_anchor
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: File "/usr/lib64/python3.6/shutil.py", line 245, in copy
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: copyfile(src, dst, follow_symlinks=follow_symlinks)
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: File "/usr/lib64/python3.6/shutil.py", line 120, in copyfile
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: with open(src, 'rb') as fsrc:
Apr 22 19:45:23 localhost suse-migration-prepare[6137]: FileNotFoundError: [Errno 2] No such file or directory: '/system-root/etc/pki/trust/anchors/rda-ca.pem'
Apr 22 19:45:23 localhost systemd[1]: suse-migration-prepare.service: Main process exited, code=exited, status=1/FAILURE
Apr 22 19:45:23 localhost systemd[1]: Failed to start Prepare For Migration.
Apr 22 19:45:23 localhost systemd[1]: suse-migration-prepare.service: Unit entered failed state.
Apr 22 19:45:23 localhost systemd[1]: suse-migration-prepare.service: Failed with result 'exit-code'.
I'll take a look
@jaawasth Hmm, I don't understand this. Our process does
os.listdir('/system-root/etc/pki/trust/anchors')
which returns rda-ca.pem
and next we call
shutil.copy('/system-root/etc/pki/trust/anchors/rda-ca.pem', '/etc/pki/trust/anchors/')
which fails saying /system-root/etc/pki/trust/anchors/rda-ca.pem
does not exist. Can you please check the contents of /etc/pki/trust/anchors
on the host to be upgraded and see if there is something special? Maybe this file is a symlink?
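For illustration only, here is a minimal sketch of why an absolute symlink in the mounted system breaks this copy and how the copy step could guard against it. This is an assumption on my side, not the actual DMS code; the paths and the safe_copy helper are made up for the example:

import os
import shutil

def safe_copy(source, dest_dir, root='/system-root'):
    # The host's rda-ca.pem is an absolute symlink. Inside the migration
    # live system its target only exists below /system-root, so a plain
    # shutil.copy() (which follows symlinks) raises FileNotFoundError.
    if os.path.islink(source):
        target = os.readlink(source)
        if os.path.isabs(target):
            # Re-anchor the absolute target below the mounted system root
            source = os.path.join(root, target.lstrip('/'))
    if os.path.exists(source):
        shutil.copy(source, dest_dir)
    else:
        print('skipping unresolvable symlink: {0}'.format(source))

anchors = '/system-root/etc/pki/trust/anchors'
for name in os.listdir(anchors):
    safe_copy(os.path.join(anchors, name), '/etc/pki/trust/anchors/')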
Thanks
@schaefi, yes, I wanted to have a look at this file, but unfortunately I don't have access to the servers right now; it will only be possible later in the day.
These files come from certs installed by a HW tool we need to run on the system. I'll try to get access to the system and check them.
Thanks Jai. I think it's important to understand why it failed. In the meantime I created a pull request that prevents the migration process from failing just because of the copy exception.
Nevertheless we need to understand why this happens and maybe come up with a better solution than just ignoring the error
Thanks
Can you please provide a test RPM as well?
@schaefi, correct, this is a symlink.
Can you please provide me the new test RPMs with the latest change?
total 4
drwxr-xr-x 4 root root   38 Mar 10  2020 ..
-rw-r--r-- 1 root root 1298 May  8  2020 rmt-server.pem
drwxr-xr-x 2 root root   46 Nov 23 17:23 .
lrwxrwxrwx 1 root root   27 Apr 19 16:04 rda-ca.pem -> /etc/rda/private/rda-ca.pem
That explains everything. I updated the PR and hope @jesusbv can have a short look.
Can you please provide me the new test RPMs with the latest change?
I will once the open PR has been reviewed.
@jaawasth Please find new packages on the jump host
Have a great weekend and stay safe
This time I'm migrating with the HW-recommended software intact.
We add an ISO medium from which we install the required HW-recommended packages on SLES.
Problem retrieving files from 'SFS-2.24'. Failed to mount iso:/?iso=/var/foundation-2.24-cd1-media-sles12sp4-x86_64.iso on : Unable to find iso filename on source media History:
I had added this repo for installing hardware packages, and it was not able to find the ISO medium for the repo. I removed that and it worked fine. But can we add a check here as well to skip such failures?
@jaawasth I think this is one of the areas that are outside of the scope of the DMS (or any migration concept). The repository setup (whether remote or local) is something we consider an essential part of the system configuration. If the DMS were to ignore configured but unreachable software repositories, or add its own mechanism to deal with them, I would consider that a design mistake.
So the repo setup is considered to be there for a reason, and the DMS should not mess with it.
I had added this repo for installing hardware packages, and it was not able to find the ISO medium for the repo. I removed that and it worked fine.
Exactly, and that action should be taken by somebody who knows this is OK and will not cause harm.
Thoughts ?
Sure, but from a migration perspective, does a repo added by the end user matter? We should not mess with the repo, but is there a way to handle it as a pre-check during migration? If there is an external repo set up by the customer, we could warn them beforehand to remove those repos, or add an option/flag to ignore those repos during the migration. Since the end user doesn't know what errors are present, it becomes a long process to identify what the error actually is, so can we add alerts or an ignore option?
This is a good point. We can certainly add a sort of repo pre-check in the activation/run-migration code such that a warning can be issued prior to starting the process. I will add that as a new issue.
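For illustration, a rough sketch of what such a pre-check could look like; this is an assumption only, not the planned implementation, it just uses the standard /etc/zypp/repos.d layout and a plain zypper refresh per repository:

import configparser
import glob
import subprocess

def precheck_repos(repos_dir='/etc/zypp/repos.d'):
    # Try to refresh every enabled zypper repository and collect the
    # aliases that fail, e.g. a repo pointing to a removed ISO image.
    broken = []
    for repo_file in glob.glob('{0}/*.repo'.format(repos_dir)):
        config = configparser.ConfigParser()
        config.read(repo_file)
        for alias in config.sections():
            if config.get(alias, 'enabled', fallback='1') != '1':
                continue
            result = subprocess.run(
                ['zypper', '--non-interactive', 'refresh', alias],
                stdout=subprocess.PIPE, stderr=subprocess.STDOUT
            )
            if result.returncode != 0:
                broken.append(alias)
    return broken

for alias in precheck_repos():
    print(
        'WARNING: repository {0!r} is not reachable, consider removing '
        'or disabling it before starting the migration'.format(alias)
    )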
thanks for tracking this.
@schaefi, when will the official RPMs be available?
Hi Jai. The packages were released yesterday. The 2.0.24 versions of the packages should be available in the channels soon. @aosthof Do you have more information on when the packages will actually be available to customers?
@schaefi @jaawasth The packages have been flagged internally as 'done' with a release date of 2021-05-04, but they are not yet available in the official public repos (at the time of writing this). It could be that the mirroring process to the CDN is still in progress. I assume the packages will be available within the next couple of hours.
That said, I'm also checking with our Maintenance team to make sure nothing went wrong.
Thanks @aosthof @schaefi , please do keep me posted.
@jaawasth The packages are now publicly available. Kudos to @aosthof, who managed to get them published.
Thanks both @schaefi & @aosthof for the help.
Hi Marcus,
As a continuation of our chat on the issue https://github.com/SUSE-Enceladus/azure-li-services/issues/266
I'm filing a bug with this project. Please see the above issue on the MS images for more details.
Thanks !!