jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

[PuppetMaster] Migrate VM from OSUOSL to Azure #3613

Closed dduportal closed 1 year ago

dduportal commented 1 year ago

Service(s)

Azure, Other

Summary

Upgrading puppet.jenkins.io to Ubuntu 22.04 broke the Puppet Enterprise server (see https://github.com/jenkins-infra/helpdesk/issues/2982#issuecomment-1570715518), as Jammy is not supported by PE 🤦

This issue tracks the work to migrate the VM to a Terraform-managed Azure VM and restore the service from the backups taken before the Ubuntu upgrade.

Pros:

Cons:

Reproduction steps

No response

dduportal commented 1 year ago

Update:

dduportal commented 1 year ago

Puppet Server installation:

$ ls -l /var/lib/puppet/keys
total 8
-r-------- 1 pe-puppet root 1679 Jun  1 10:52 private_key.pkcs7.pem
-r-------- 1 pe-puppet root 1050 Jun  1 10:52 public_key.pkcs7.pem
# Check the Master hostname IS "puppet.jenkins.io" 

### 
Step 1 of 10: Stopping PE related services
# ...
# Stuck at step 10 of 10, because:
# - Service pe-puppetdb hangs during its startup: https://tickets.puppetlabs.com/browse/PDB-4785
# - Logs in /var/log/puppetlabs/puppetdb/puppetdb.log show postgres is started, but the puppetdb <-> postgres connection fails during the TLS handshake (confirmed with tcpdump)
# - https://tickets.puppetlabs.com/browse/PDB-4625

Looking at https://www.puppet.com/docs/puppetdb/7/postgres_ssl.html#using-a-custom-java-keystore (yes, version 7 but the keystore is the same)

dduportal commented 1 year ago
Log messages will be saved to /var/log/puppetlabs/pe-backup-tools/pe_restore-2023-06-01_13.24.25_UTC.log

Step 1 of 10: Stopping PE related services
Step 2 of 10: Cleaning the agent certificates from previous PE install
Step 3 of 10: Restoring PE file system components
Step 4 of 10: Restoring the pe-orchestrator database
Step 5 of 10: Restoring the pe-rbac database
Step 6 of 10: Restoring the pe-classifier database
Step 7 of 10: Restoring the pe-activity database
Step 8 of 10: Restoring the pe-inventory database
Step 9 of 10: Restoring the pe-puppetdb database
Step 10 of 10: Configuring PE on newly restored master

Backup restored.
  Time to restore: 4 min, 6 sec
  Size: 2.26 GB, Scope: code, puppetdb, config, certs

To finish restoring your primary server from backup, run the following commands:
puppet agent --test
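
Since the first restore attempt hung at step 10 of 10, it can help to check a saved restore output mechanically. This is a minimal sketch, not part of the PE tooling: `restore_complete` is a hypothetical helper, and it assumes the output keeps the `Step i of N:` and `Backup restored.` line format shown above.

```shell
# Hypothetical helper: reads pe_restore output on stdin and succeeds only if
# every "Step i of N:" line and the final "Backup restored." line are present.
restore_complete() {
  awk '
    /^Step [0-9]+ of [0-9]+:/ {
      seen[$2 + 0] = 1   # step number, forced to a number
      total = $4 + 0     # "10:" converts to 10 (leading numeric prefix)
    }
    /^Backup restored\./ { restored = 1 }
    END {
      if (!restored || total == 0) exit 1
      for (i = 1; i <= total; i++) if (!(i in seen)) exit 1
      exit 0
    }'
}
```

For example, `restore_complete < /var/log/puppetlabs/pe-backup-tools/pe_restore-2023-06-01_13.24.25_UTC.log && echo "restore finished"`, assuming the log file contains the same lines as the console output.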
$ ls -l /root/.ssh/config /root/.ssh/deploy_key 
-rw-r--r-- 1 root root   55 Jun  1 10:57 /root/.ssh/config
-r-------- 1 root root 1679 Jun  1 10:57 /root/.ssh/deploy_key
$ cat /root/.ssh/config 
Host github.com
    IdentityFile /root/.ssh/deploy_key
$ ssh -T git@github.com
Hi jenkins-infra/jenkins-keys! You've successfully authenticated, but GitHub does not provide shell access.
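
The permission check on the deploy key above can be scripted. This is a minimal sketch, assuming GNU `stat` (as shipped on Ubuntu); `key_mode_ok` is a hypothetical helper name, and treating only modes 400 and 600 as acceptable is an assumption made for illustration.

```shell
# Hypothetical helper: succeeds when the private key file is only accessible
# by its owner (octal mode 400 or 600).
key_mode_ok() {
  # GNU stat: %a prints the octal permission bits, e.g. 400.
  mode="$(stat -c '%a' "$1")" || return 1
  case "$mode" in
    400|600) return 0 ;;
    *)       return 1 ;;
  esac
}
```

For example: `key_mode_ok /root/.ssh/deploy_key || echo "deploy key permissions too open"`.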

$ r10k deploy environment --color --verbose --puppetfile
# No errors, WARN accepted
$ puppet agent --test
# ...
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Could not find class pe_console_prune for puppet.jenkins.io on node puppet.jenkins.io
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: Error 500 on SERVER: Server Error: Could not autoload puppet/reports/datadog_reports: Datadog report config file /etc/datadog-agent/datadog-reports.yaml not readable
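
The last error says the Datadog report config file is not readable. A minimal sketch of checking this is below; `report_config_readable` is a hypothetical helper, and the idea that the Puppet server reads the file as the `pe-puppet` user is an assumption based on the key file ownership shown earlier.

```shell
# Hypothetical helper: succeeds when the given file exists and is readable by
# the user running this function.
report_config_readable() {
  [ -r "$1" ]
}

# Assumption: the Puppet server process runs as pe-puppet, so the check would
# be run as that user, for example:
#   sudo -u pe-puppet sh -c '[ -r /etc/datadog-agent/datadog-reports.yaml ]' && echo readable
```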
dduportal commented 1 year ago

Update:

dduportal commented 1 year ago
dduportal commented 1 year ago

Closing as it works as expected