Closed n3wscott closed 3 years ago
Has this issue been resolved @n3wscott ? I noticed you mentioned that it worked in the Slack channel threads...
Maybe it was a case of not having a good backup or the particular latest wal-e backup was corrupt when the update was tried?
The failures could have been related to the invalid database password that we found, in order to do the upgrade from Deis Workflow to Hephy Workflow since K8s control plane had already been upgraded past K8s 1.16, the upgrade to latest Hephy Workflow (now with built-in support for K8s v1 APIs) would have failed to start controller and connect to database from it.
So we did helm uninstall / helm reinstall, after taking backups of every configmap, secret, deployment, daemonset, etc., helm get values [release-name]
, service, ingress, anything else that might be needed to put Deis back the way it was.
When it came back online after reinstall, the database database-creds
had apparently been wiped and overwrote by a new generated credential, but s3 credentials were correct so database restore succeeded. (The new creds that did not match the creds required by database backup restored.)
But this was no problem, restoring the secret database-creds
worked and controller connected successfully.
I'm not sure though. The errors given in @n3wscott's top post weren't in the controller, they are from postgres database.
Did you try upgrading to v2.7.6 again after that password issue was all resolved? Did it turn out to have anything to do with the database-creds
issue, or is it still something else?
Sounds like database-creds
was the issue here and may have been a configuration issue during the upgrade process... If we don't heard back, we should close this.
Thanks for the comments, let me try upgrading back to v2.7.6 and report back.
@n3wscott , never heard back anything. If it is still an issue, feel free to reopen this issue.
Totally could be an edge case, but I am upgrading from a v2.12 install of workflow and I am hitting a looping error in the database controller:
These error loops seemingly forever.
And v2.7.3 seems to boot after it preformed an upgrade.