SUSE / velum

Dashboard for CaaS Platform clusters (v1, v2 and v3)
https://www.suse.com/
Apache License 2.0

[3.0] enable update channels #654

Closed · MaximilianMeister closed 5 years ago

MaximilianMeister commented 5 years ago

feature#update-channels

Signed-off-by: Maximilian Meister mmeister@suse.de

Depends on

https://github.com/kubic-project/salt/pull/658
https://github.com/thkukuk/update-checker

Backport of

https://github.com/kubic-project/velum/pull/638

MaximilianMeister commented 5 years ago

@vitoravelino removed the CSS bit again, but I need to test the failure case once more; I suspect the btn link will still appear in the failure message. I also added some specs for the mirror sync check, and I managed to get the other test green by simply setting tx_update_mirror_synced: true on all nodes, not only the admin... thanks a lot for pinning down these issues!
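For reference, that grain can be inspected on a running cluster like this (a sketch reusing the docker exec pattern from the instructions below):

# show the tx_update_mirror_synced grain for every minion
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt '*' grains.get tx_update_mirror_synced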

vitoravelino commented 5 years ago

I've pulled the latest changes, and what I noticed is that the admin node migration doesn't seem to happen anymore. The admin node is not rebooting, so I'm able to refresh the page and click the link again, and it fails because there's already an orchestration ongoing (I had to check the devtools to see this). I followed the same instructions as before. Did anything change? Is the lack of a reboot now expected?
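For reference, one way to confirm a still-running orchestration from the salt-master container (a sketch using the stock salt runners, nothing velum-specific):

# a running orchestration shows up among the active jobs
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt-run jobs.active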

MaximilianMeister commented 5 years ago

Did anything change? Is the lack of a reboot now expected?

Nothing changed; it looks like your salt orchestration failed? Normally the event processor should then set the orchestration to failed; I'm not sure why that didn't happen in your case.
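If you want to see what the event processor receives, one option (just a debugging sketch, not required for the migration) is to stream the salt event bus from the master:

# orchestration results appear on the event bus as salt/run/<jid>/ret events
docker exec -it $(docker ps | grep salt-master | awk '{print $1}') salt-run state.event pretty=True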

Keep in mind it takes a while; you can follow the progress in /var/log/migration*.

What we could do is hide all migration buttons while a migration is in progress, in case the user mistakenly reloads the page during the admin migration. Or we could show a different message indicating that a migration is ongoing, with no button... but we'd need to export this migration_in_progress grain somehow through the minion model, which I don't think is a good idea, as it's only temporary, and we'd have to do the same for update_in_progress. I'll try to fix this by using the pending highstate on the admin as an indicator.
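For illustration, that indicator could be checked manually from the salt-master container; this is only a sketch, and the 'admin*' minion target is an assumption about naming in this setup:

# reports job details if a highstate is still running on the admin node
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt 'admin*' saltutil.is_running state.highstate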

Here are the instructions again:

# register
docker exec -it $(docker ps | grep "salt-master_velum" | awk '{print $1}') salt "*" cmd.run "SUSEConnect -r [CODE]"

# install 3.0 maintenance updates
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt -P "roles:(admin|kube-(master|minion))" cmd.run '/usr/sbin/transactional-update cleanup dup salt'
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt '*' saltutil.refresh_grains
docker exec -it $(docker ps | grep "velum-dashboard" | awk '{print $1}') entrypoint.sh bash -c "export RAILS_ENV=production; bundle exec rails runner 'Minion.update_grains'"
# reboot via velum

## add update-checker-migration
# on the host, download these rpm's and copy them to the nodes
for i in 1.0 2.0 3.0; do scp ~/Downloads/update-checker-1.0+git20180905.814bdd8-1.1.noarch.rpm root@10.17.$i:/root/; done

# on each node install the packages and reboot
transactional-update cleanup pkg install --oldpackage /root/*.rpm
# when prompted, choose (2) to break the perl dependency, which is not used by update-checker-migration
reboot
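# (optional sanity check, not in the original steps) after the reboot, confirm
# the package landed; the exact package name here is an assumption
rpm -q update-checker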

# adapt the update-checker conf to write grains
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt -P "roles:(admin|kube-(master|minion))" cmd.run "sed -i -e 's|output=.*|output=salt|g' /etc/update-checker.conf"
# write the grains
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt -P "roles:(admin|kube-(master|minion))" cmd.run "update-checker-migration"
docker exec -i $(docker ps | grep salt-master | awk '{print $1}') salt '*' saltutil.refresh_grains
docker exec -it $(docker ps | grep "velum-dashboard" | awk '{print $1}') entrypoint.sh bash -c "export RAILS_ENV=production; bundle exec rails runner 'Minion.update_grains'"

MaximilianMeister commented 5 years ago

Added another fix: when the user reloads the page during a migration, there was a short timeframe where tx_update_reboot_needed was set, and therefore the UPDATE ADMIN NODE link popped up.

Also removed some duplication in the notification selectors by putting the notification cleanup in a reusable function (DRY).

MaximilianMeister commented 5 years ago

rebased

jordimassaguerpla commented 5 years ago

Can we have an approval on this one?

jordimassaguerpla commented 5 years ago

Hi. I am sorry, but we are removing the release-3.0 branch. Due to a change of priorities, I don't think this fix will be necessary for 3.0, so I am closing this PR.