Expiration of the Certificate Revocation List (CRL) is fatal to
communication between Puppet Enterprise components, resulting
in a complete outage of service. Puppet 8 sets the crl_refresh_interval
to 1 day by default so that agents will pull in updates to the
CRL file.
However, Puppet Server does not ensure that CRLs are updated
on a regular cadence. In most installations, some level of
turnover in the agent population results in CRL updates.
PE, though, enables the infrastructure CRL, which is updated
only by the addition or removal of a compiler node.
Additionally, Puppet 6 adds a "Root CA" with an associated CRL
for which no update workflow exists.
Without automated updates to ensure CRLs are refreshed,
every Puppet installation is at risk of a complete outage
when one of these CRLs expires.
Reproduction Case
Obtain a RHEL 8 VM.
Install PE 2021.7.2.
Ensure CRL refresh is enabled:
/opt/puppetlabs/bin/puppet config set crl_refresh_interval 1d
Create and destroy a certificate to update leaf CRLs with a 5-year expiration:
# Stop puppet agent to prevent management of infra_inventory.txt
systemctl stop puppet
/opt/puppetlabs/bin/puppetserver ca generate --certname foo.example
printf '\nfoo.example\n' >> /etc/puppetlabs/puppetserver/ca/infra_inventory.txt
/opt/puppetlabs/bin/puppetserver ca clean --certname foo.example
Disable clock synchronization and then set the system forward to within
30 days of CRL expiration:
timedatectl set-ntp false
# Additionally, if the VM is hosted on vSphere
vmware-toolbox-cmd timesync disable
# Check CRL expiration. Currently hard-coded to 5 years for CRLs generated
# by the Puppet Server process.
openssl crl -in "$(puppet config print cacrl)" -noout -nextupdate
timedatectl set-time "$(date --date '1800 days' +'%Y-%m-%d %H:%M:%S')"
Restart Puppet Server and run the agent:
systemctl restart pe-puppetserver
puppet agent -t
Advance the system clock another 30 days and run the agent:
timedatectl set-time "$(date --date '30 days' +'%Y-%m-%d %H:%M:%S')"
puppet agent -t
Outcome
The agent run fails due to an expired CRL:
# puppet agent -t
Info: Refreshing CRL
Error: certificate verify failed [CRL has expired for CN=deluxe-mile.delivery.puppetlabs.net]
Error: certificate verify failed [CRL has expired for CN=deluxe-mile.delivery.puppetlabs.net]
Expected Outcome
At service start, and on a regular interval, Puppet Server updates any CRL
that is within 30 days of expiration.
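As a rough illustration of the 30-day window described above, the following
shell sketch decides whether a CRL is due for a refresh based on the line
printed by openssl crl -nextupdate. The function names and the threshold
argument are hypothetical and not part of Puppet Server; GNU date is assumed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Puppet tooling.

# Convert an `openssl crl -noout -nextupdate` line ("nextUpdate=<date>")
# into whole days remaining, relative to a reference epoch (default: now).
# Relies on GNU date to parse the OpenSSL date format.
days_until_next_update() {
  local line="$1" now="${2:-$(date +%s)}"
  local expiry
  expiry="$(date --date "${line#nextUpdate=}" +%s)" || return 2
  echo $(( (expiry - now) / 86400 ))
}

# Exit status 0 when the CRL is inside the refresh window (default 30 days),
# 1 when it is not, 2 when the date could not be parsed.
crl_needs_refresh() {
  local line="$1" threshold="${2:-30}" now="${3:-$(date +%s)}"
  local days
  days="$(days_until_next_update "$line" "$now")" || return 2
  [ "$days" -le "$threshold" ]
}

# Possible usage on a PE primary server:
#   line="$(openssl crl -in "$(puppet config print cacrl)" -noout -nextupdate)"
#   crl_needs_refresh "$line" && echo "CRL is due for refresh"
```

Passing the reference time as an optional argument keeps the check testable
without changing the system clock.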
The example above only presents the expiration of the leaf CRL, but the
CRL from the "Puppet Root CA" must also be considered. Puppet Server
should refresh any CRL in the chain for which it has access to the
corresponding private key.
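To inspect every CRL in the chain rather than only the first, a sketch such
as the following may help. It assumes the file reported by
puppet config print cacrl is a PEM bundle holding more than one CRL, and
that openssl crl reads only the first PEM block it finds. The function names
are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Puppet tooling.

# Split a PEM bundle ($1) into one file per CRL inside directory $2,
# then print the path of each file that was written.
split_crl_bundle() {
  local bundle="$1" outdir="$2" f
  awk -v dir="$outdir" '
    /-----BEGIN X509 CRL-----/ { n++; out = sprintf("%s/crl-%02d.pem", dir, n) }
    out != ""                  { print > out }
    /-----END X509 CRL-----/   { close(out); out = "" }
  ' "$bundle"
  for f in "$outdir"/crl-*.pem; do
    if [ -e "$f" ]; then printf '%s\n' "$f"; fi
  done
}

# Print the issuer and nextUpdate of every CRL found in the bundle ($1).
report_crl_chain() {
  local bundle="$1" tmp
  tmp="$(mktemp -d)"
  split_crl_bundle "$bundle" "$tmp" | while read -r f; do
    openssl crl -in "$f" -noout -issuer -nextupdate
  done
  rm -rf "$tmp"
}

# Possible usage on a PE primary server:
#   report_crl_chain "$(puppet config print cacrl)"
```

Each issuer line can then be compared against the CA certificates whose
private keys are available to Puppet Server.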