markt-de / puppet-acme

Centralized SSL certificate management using acme.sh and the ACME protocol
https://forge.puppet.com/markt/acme
Apache License 2.0
9 stars 17 forks source link

Error for new certificates after upgrade to 2.1.0 #31

Closed philipfreude closed 4 years ago

philipfreude commented 4 years ago

After upgrading to version 2.1.0 and issuing a new certificate I get an error on my puppet master.

Error: Could not retrieve catalog from remote server: 
Error 500 on SERVER: Server Error: 
Could not find resource 'File[/etc/acme.sh/configs/gitlab.example.com/params.dh]' in parameter 'require' 
(file: /etc/puppetlabs/code/environments/production/modules/acme/manifests/deploy/crt.pp, line: 99) on node puppet.example.com

Not sure why this is happening..

After downgrading to 2.0.0 the error is gone.

fraenki commented 4 years ago

@phifre How are you using the module, could you paste your manifest/hiera code?

philipfreude commented 4 years ago

Sure!

I was trying to build a simple reverse proxy / load balancer that also terminates ssl.

Here is my profile:

class profiles::load_balancer (
  Hash[String, Hash] $servers = undef,
  String $acme_profile        = undef,
  String $acme_account        = undef,
) {
  include nginx
  include acme

  $servers.each | String $server, Hash $server_hash | {
    if !$server_hash['proxy'] {
      fail("Proxy destination must be defined.")
    }
    if $server_hash['ssl'] == true {
      acme::certificate { $server:
        use_profile    => $acme_profile,
        use_account    => $acme_account,
        letsencrypt_ca => 'production',
        notify         => Class['nginx::service'],
      }
      nginx::resource::server { $server:
        proxy        => $server_hash['proxy'],
        ssl          => true,
        ssl_redirect => true,
        ssl_cert     => "${acme::crt_dir}/${server}/cert.pem",
        ssl_key      => "${acme::key_dir}/${server}/private.key",
      }
    } else {
      nginx::resource::server { $server:
        proxy => $server_hash['proxy'],
      }
    }
  }
}

with the following example hiera data:

profiles::load_balancer::acme_profile: 'cloudflare'
profiles::load_balancer::acme_account: 'name@mail.com'
profiles::load_balancer::servers:
  'gitlab.example.com'
    proxy: 'http://internal.dns.name'
    ssl: true
fraenki commented 4 years ago

@phifre Thanks! Just to verify my assumption: is this node also running your Puppetserver, or is it a different node?

philipfreude commented 4 years ago

That's indeed a node different from the puppetserver.

fraenki commented 4 years ago

That's an interesting bug. It seems to date back to version 1.0. But I don't know which change caused this bug to surface.

Anyway, it seems odd that class acme::deploy::crt requires a file that is created by acme:csr in a previous Puppet run.

@phifre Can you confirm that https://github.com/fraenki/puppet-acme/commit/d77cb1808d56064ecd192fd81d17ccdef0588a3c fixes this issue?

philipfreude commented 4 years ago

Well it does not really fix it.. Now I get the following error on my Puppet Master node:

Error: /Stage[main]/Acme/Acme::Request::Crt[gitlab.example.com]/Acme::Deploy::Crt[gitlab.example.com]/Concat[/etc/acme.sh/certs/gitlab.example.com/fullchain.pem]/Concat_file[/etc/acme.sh/certs/gitlab.example.com/fullchain.pem]: Failed to generate additional resources using 'eval_generate': Could not retrieve source(s) /etc/acme.sh/configs/gitlab.example.com/params.dh
Error: /Stage[main]/Acme/Acme::Request::Crt[gitlab.example.com]/Acme::Deploy::Crt[gitlab.example.com]/Concat[/etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem]/Concat_file[/etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem]: Failed to generate additional resources using 'eval_generate': Could not retrieve source(s) /etc/acme.sh/keys/gitlab.example.com/private.key
Error: Could not set 'file' on ensure: No such file or directory @ rb_sysopen - /etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem
Error: Could not set 'file' on ensure: No such file or directory @ rb_sysopen - /etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem
Wrapped exception:
No such file or directory @ rb_sysopen - /etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem
Error: /Stage[main]/Acme/Acme::Request::Crt[gitlab.example.com]/Acme::Deploy::Crt[gitlab.example.com]/Concat[/etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem]/File[/etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem]/ensure: change from 'absent' to 'file' failed: Could not set 'file' on ensure: No such file or directory @ rb_sysopen - /etc/acme.sh/keys/gitlab.example.com/fullchain_with_key.pem

I hope I have some time to dig a little deeper in the next days..

Thanks for your help!

philipfreude commented 4 years ago

I think I have an idea why this is happening.

Let's say I need two different certificates. For sake of argument let's call them puppet.example.com and gitlab.example.com. The first certificate is for the puppetserver node. The seconds one is for a different node. The fact acme_crtscontains

gitlab.example.com
puppet.example.com

Now puppet is executed on the puppetserver. In line 137 of init.ppthese two certificate names are stored in acme_crts_array. One line after that ::acme::request::crtis called with this array. In line 33 of request/crt.pp acme::deploy::crt is called. This works fine with a certificate for the puppetserver because all required files are already there. However for a certificate for a different node for example the params.dhfile ($dh = "${cfg_dir}/${real_domain}/params.dh") does not exists on the puppetserver. The same is true for the fullchain or the fullchain_with_key.

I hope what I write makes sense..

philipfreude commented 4 years ago

I have fixed this in 35a0421. Please have a look and confirm this makes sense. It fixes the issue for me.

Can you confirm if requesting certificate from the puppetserver and other nodes is working for you before?

fraenki commented 4 years ago

@phifre TBH, this puzzles me. Nothing in version 2.1.0 could explain the issue that you have reported in your first post, AFAICT: https://github.com/fraenki/puppet-acme/compare/2.0.0...2.1.0

I've noticed that you have only provided the code that you use on "gitlab.example.com". Could you also provide the manifest/hiera code for "puppet.example.com"?

fraenki commented 4 years ago

Regarding your commit https://github.com/fraenki/puppet-acme/commit/35a042195d9aff5583c5698c64ebf0da878267ed...

Check if certificate was requested from current node and skip creation of chains if not. This is necessary because otherwise all nodes would try to create the chain.

Could you please elaborate further on this? The code should already prevent this:

https://github.com/fraenki/puppet-acme/blob/master/manifests/certificate.pp#L53-L56 https://github.com/fraenki/puppet-acme/blob/master/manifests/deploy.pp#L16-L19

To trigger this code, your node would need to have acme::certificate { 'gitlab.example.com': ...} defined somewhere. So if this is only defined on one node, then this code cannot run on any other node.

philipfreude commented 4 years ago

First of all, here is my puppetserver profile

class profiles::puppetserver (
  Boolean $use_ssl     = false,
  String $acme_profile = undef,
  String $acme_account = undef,
) {
  # Configure puppetdb and its underlying database
  include puppetdb
  # Configure the Puppet master to use puppetdb
  class { 'puppetdb::master::config':
    manage_report_processor => true,
    enable_reports => true,
  }
  # Configure puppetboard
  class { 'puppetboard':
    manage_git         => true,
    manage_virtualenv  => true,
    virtualenv_version => '3.6',
    enable_catalog     => true,
  }
  # Configure Apache on this server
  include apache
  class { 'apache::mod::wsgi':
    package_name     => 'libapache2-mod-wsgi-py3',
    mod_path         => '/usr/lib/apache2/modules/mod_wsgi.so',
  }
  if $use_ssl {
    # Issue certificate for puppetboard
    include acme
    acme::certificate { $::fqdn:
      use_profile    => $acme_profile,
      use_account    => $acme_account,
      letsencrypt_ca => 'production',
      notify         => Class['apache::service'],
    }
    # where acme stores our certificate and key files
    $my_key = "${acme::key_dir}/${::fqdn}/private.key"
    $my_cert = "${acme::crt_dir}/${::fqdn}/cert.pem"
    $my_chain = "${acme::crt_dir}/${::fqdn}/chain.pem"
    # Access Puppetboard through https://$::fqdn
    class { 'puppetboard::apache::vhost':
      vhost_name => $::fqdn,
      port       => 443,
      ssl        => true,
      ssl_cert   => $my_cert,
      ssl_key    => $my_key,
      subscribe  => Acme::Certificate[$::fqdn],
    }
    # Redirect requests to https
    apache::vhost { "${::fqdn}:80":
      servername      => $::fqdn,
      port            => 80,
      docroot         => '/var/www/redirect',
      redirect_status => 'permanent',
      redirect_dest   => "https://${::fqdn}"
    }
  } else {
    # Access Puppetboard through http://$::fqdn
    class { 'puppetboard::apache::vhost':
      vhost_name => $::fqdn,
      port       => 80,
    }
  }
}

with the following hiera data:

# Activate ssl for puppetboard
profiles::puppetserver::use_ssl: true
# Set acme profile and account
profiles::puppetserver::acme_profile: 'cloudflare'
profiles::puppetserver::acme_account: 'name@mail.com'
fraenki commented 4 years ago

@phifre Thanks! I guess you call the acme class somewhere else to set accounts and profiles?

(EDIT: @phifre if you're on IRC you could get in touch with me on Freenode or OFTC)

philipfreude commented 4 years ago

Well I have the include acme in this profile and the common.yaml sets the accounts and profiles:

....
acme::accounts: ['name@mail.com']
acme::profiles:
  'cloudflare':
    challengetype: 'dns-01'
    hook: 'cf'
    env:
      CF_Account_ID: 7xxxxxxxxxxxxxxxxxxxxa
      CF_Token: ENC[PKCS7,MIICmQ...x4Kr/HDqBmq1]
    options: {}
....
philipfreude commented 4 years ago

Regarding my commit 35a0421: The code lines you mentioned prevent certificates from being shipped to nodes that have not requested them. However the error I am seeing is on the puppetserver.

To my understanding init.pp#L138 is called for every certificate requested by some node. In my case $acme_crts_array = ['gitlab.example.com', 'puppet.example.com']. Now request/crt.pp#L33 is called with $domain = 'gilab.example.com' (still on the puppetserver). However in deploy/crt.pp#L78 the puppetserver tries to access the key which is only present on the node that requested the certificate.

This is why puppet on the puppetserver fails for me when any node (other than the puppetserver) has requested a certificate.

EDIT: How can I reach you there?

fraenki commented 4 years ago

Regarding IRC: fraenki :)

fraenki commented 4 years ago

Now request/crt.pp#L33 is called with $domain = 'gilab.example.com' (still on the puppetserver).

No, this code is not executed on your Puppetserver, it is a exported resource (note that it's prepended with @@): https://puppet.com/docs/puppet/6.17/lang_exported.html#syntax

fraenki commented 4 years ago

@phifre Thanks for taking the time to debug this issue! Let me summarize our discussion on IRC.

By looking at your PuppetDB we found something interesting:

  {
    "certname": "puppet.example.com",
    "type": "Acme::Deploy::Crt",
    "title": "gitlab.example.com",
    "tags": [
      "deploy",
      "acme::deploy::crt",
      "gitlab.example.com",
      "puppet.example.com",
      "crt",
      "class",
      "puppetserver",
      "profiles",
      "request",
      "acme",
      "profiles::puppetserver",
      "acme::request::crt",
      "roles::puppetserver",
      "roles",
      "node"
    ]
  },

One can see that tags contains both the name of the certificate gitlab.example.com and the name of your Puppetserver puppet.example.com. This explains why the module tries to deploy all certs on the wrong host – it finds a matching tag.

It looks like the nesting is the issue here: By including the acme module in your puppetserver role/profile, the name of your Puppetserver is automatically added to the tags of the exported resource. This also explains why I was unable to reproduce this issue: using the acme module directly does not trigger this issue.

This in turn reveals that the usage of tags in this module is flawed; using only the FQDN as tag identifier is not sufficient, because the FQDN could easily be added to tags by other modules/manifests.

I've modified all tags to use prefixes, this allows the module to better identify which exported resource should be run on the Puppetserver and which should be run only on the requesting node. Automatically added tags are no longer an issue.

BEFORE

In version 2.1.0 the tag only contains the FQDN:

puppet query "resources[certname, type, title, tags] { exported = true and type ~ 'Acme' and title ~ 'test' }"

[
  {
    "certname": "puppetserver.example.com",
    "type": "Acme::Deploy::Crt",
    "title": "test.example.com",
    "tags": [
      "deploy",
      "acme::deploy::crt",
      "crt",
      "class",
      "test.example.com",
      "request",
      "acme",
      "acme::request::crt"
    ]
  },
  {
    "certname": "test.example.com",
    "type": "Acme::Request",
    "title": "test.example.com",
    "tags": [
      "class",
      "acme::certificate",
      "acme::request",
      "puppetserver.example.com",
      "test.example.com",
      "certificate",
      "request",
      "acme",
      "acme::csr",
      "csr"
    ]
  }
]

AFTER

Now the tags are either prefixed with crt_ or master_ to reliably identify the target host for each exported resource:

puppet query "resources[certname, type, title, tags] { exported = true and type ~ 'Acme' and title ~ 'test' }"

[
  {
    "certname": "puppetserver.example.com",
    "type": "Acme::Deploy::Crt",
    "title": "test.example.com",
    "tags": [
      "deploy",
      "acme::deploy::crt",
      "crt",
      "class",
      "crt_test.example.com",
      "test.example.com",
      "request",
      "acme",
      "acme::request::crt"
    ]
  },
  {
    "certname": "test.example.com",
    "type": "Acme::Request",
    "title": "test.example.com",
    "tags": [
      "master_puppetserver.example.com",
      "class",
      "acme::certificate",
      "acme::request",
      "test.example.com",
      "certificate",
      "request",
      "acme",
      "acme::csr",
      "csr"
    ]
  }
]

@phifre, could you give https://github.com/fraenki/puppet-acme/commit/038f49493074267615e62b38bc47439d61789d9a a try?

philipfreude commented 4 years ago

Awesome! I can confirm it is indeed working now.

I tried with my existing configuration / certificates and the error did not come up on the puppetserver (so it seems backwards compatible) and I cleaned the PuppetDB to issue completely new certificates and in both cases it worked!

Thanks a lot for your time & help! Looking forward to the release ;)

fraenki commented 4 years ago

Version 2.2.0 was just released. @phifre Thanks again for your patience and time!