sasa1977 / site_encrypt

Integrated certification via Let's Encrypt for Elixir-powered sites
MIT License

Unable to renew certificate due to ca error #43

Open axelson opened 2 years ago

axelson commented 2 years ago

This may not be a direct site_encrypt issue but I'm filing it here because the code is being called by site_encrypt and other site_encrypt users may run into it.

On startup and on force_certify my Phoenix application gets either this error:

Apr 14 07:27:37: 07:27:37.178 [notice] TLS :client: In state :wait_cert_cr at ssl_handshake.erl:2074 generated CLIENT ALERT: Fatal - Unknown CA
Apr 14 07:27:37: 07:27:37.179 [error] Task #PID<0.3365.0> started from #PID<0.3136.0> terminating
Apr 14 07:27:37: ** (MatchError) no match of right hand side value: {:error, %Mint.TransportError{reason: {:tls_alert, {:unknown_ca, 'TLS client: In state wait_cert_cr at ssl_handshake.erl:20>
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/http_client.ex:38: SiteEncrypt.HttpClient.request/3
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:280: SiteEncrypt.Acme.Client.API.http_request/4
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:233: SiteEncrypt.Acme.Client.API.jws_request/5
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:123: SiteEncrypt.Acme.Client.API.fetch_kid/1
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:30: SiteEncrypt.Acme.Client.for_existing_account/3
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/certification/native.ex:46: SiteEncrypt.Certification.Native.new_cert/3
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/certification/job.ex:15: SiteEncrypt.Certification.Job.certify/1
Apr 14 07:27:37: (site_encrypt 0.4.2) lib/site_encrypt/certification/job.ex:26: SiteEncrypt.Certification.Job.certify_and_apply/1
Apr 14 07:27:37: Function: #Function<0.109640683/0 in SiteEncrypt.Certification.Job.child_spec/1>
Apr 14 07:27:37: Args: []

Or this one:

Apr 13 08:03:05: 08:03:05.498 [info] TLS :client: In state :wait_cert_cr at ssl_handshake.erl:1899 generated CLIENT ALERT: Fatal - Unknown CA
Apr 13 08:03:05: 08:03:05.498 [error] Task #PID<0.3355.0> started from #PID<0.3125.0> terminating
Apr 13 08:03:05: ** (MatchError) no match of right hand side value: {:error, %Mint.TransportError{reason: {:tls_alert, {:unknown_ca, 'TLS client: In state>
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/http_client.ex:38: SiteEncrypt.HttpClient.request/3
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:280: SiteEncrypt.Acme.Client.API.http_request/4
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:233: SiteEncrypt.Acme.Client.API.jws_request/5
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client/api.ex:164: SiteEncrypt.Acme.Client.API.authorization/2
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:125: SiteEncrypt.Acme.Client.validate_authorizations/2
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:144: SiteEncrypt.Acme.Client.poll/4
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:74: SiteEncrypt.Acme.Client.process_new_order/3
Apr 13 08:03:05: (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:45: SiteEncrypt.Acme.Client.create_certificate/2
Apr 13 08:03:05: Function: #Function<0.109640683/0 in SiteEncrypt.Certification.Job.child_spec/1>
Apr 13 08:03:05: Args: []

This is a partially related Mint issue: https://github.com/elixir-mint/mint/issues/337

I'm getting this error on an Ubuntu Server host, but not on my Nerves host.

So far I've tried:

All the unknown_ca errors that I've seen online point to updating certifi/castore/hackney, which I have tried.

Unfortunately I'm out of ideas at this point. Has anyone else run into this before?

Hermanverschooten commented 2 years ago

I am having the same issue.

Hermanverschooten commented 2 years ago

I just updated with apt and there was a new ca-certificates package, but I am now getting:

Apr 22 13:11:33 dashboard dashboard[1668582]: 13:11:33.801 [info] Ordering a new certificate for domain dashboard.gratwifi.eu (CA acme-v02.api.letsencrypt.org)
Apr 22 13:15:12 dashboard dashboard[1668582]: 13:15:12.296 [error] Task #PID<0.5862.0> started from #PID<0.5629.0> terminating
Apr 22 13:15:12 dashboard dashboard[1668582]: ** (MatchError) no match of right hand side value: {:error, #SiteEncrypt.Acme.Client.API.Session<https://acme-v02.api.letsencrypt.org/directory>}
Apr 22 13:15:12 dashboard dashboard[1668582]:     (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:74: SiteEncrypt.Acme.Client.process_new_order/3
Apr 22 13:15:12 dashboard dashboard[1668582]:     (site_encrypt 0.4.2) lib/site_encrypt/acme/client.ex:45: SiteEncrypt.Acme.Client.create_certificate/2
Apr 22 13:15:12 dashboard dashboard[1668582]:     (site_encrypt 0.4.2) lib/site_encrypt/certification/native.ex:52: SiteEncrypt.Certification.Native.create_certificate/2
Apr 22 13:15:12 dashboard dashboard[1668582]:     (site_encrypt 0.4.2) lib/site_encrypt/certification/job.ex:15: SiteEncrypt.Certification.Job.certify/1
Apr 22 13:15:12 dashboard dashboard[1668582]:     (site_encrypt 0.4.2) lib/site_encrypt/certification/job.ex:26: SiteEncrypt.Certification.Job.certify_and_apply/1
Apr 22 13:15:12 dashboard dashboard[1668582]:     (elixir 1.13.3) lib/task/supervised.ex:89: Task.Supervised.invoke_mfa/2
Apr 22 13:15:12 dashboard dashboard[1668582]:     (stdlib 3.17) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Apr 22 13:15:12 dashboard dashboard[1668582]: Function: #Function<0.109640683/0 in SiteEncrypt.Certification.Job.child_spec/1>
Apr 22 13:15:12 dashboard dashboard[1668582]:     Args: []

Hermanverschooten commented 2 years ago

My problem was solved with the apt update, I had added a name to my certificate and name resolution was not working.

sasa1977 commented 2 years ago

My problem was solved with the apt update, I had added a name to my certificate and name resolution was not working.

Just to make clear I understand, it is working now, and the fix was updating the ca-certificates package?

sasa1977 commented 2 years ago

@axelson You could try making a request from the OS, e.g. with curl https://acme-v02.api.letsencrypt.org/directory, to check if the endpoint is reachable. If that works, you could try to do the same from Elixir using pure mint. I'd expect one of those two should fail.

FWIW, this library is used for https://www.theerlangelist.com/. I just double checked, and the cert has been renewed about a week ago, which I think proves that this library is working. Now admittedly, I haven't updated this project's stack for quite some time, so it is possible that your problem is caused by a combination of the latest OTP/Elixir/Mint. But I'd first try to check at the OS level.
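
The "pure Mint" check suggested above could be sketched roughly as follows. This assumes the mint and castore packages are available in the project. The module name `TlsCheck` and its functions are hypothetical and not part of site_encrypt. Instead of crashing with a MatchError, it reports which TLS alert (if any) the handshake produced.

```elixir
defmodule TlsCheck do
  # Hypothetical helper, not part of site_encrypt or Mint.
  @host "acme-v02.api.letsencrypt.org"

  # Opens a TLS connection with peer verification, closes it again,
  # and returns a classified result instead of raising.
  def run(host \\ @host) do
    result = Mint.HTTP.connect(:https, host, 443, transport_opts: [verify: :verify_peer])
    with {:ok, conn} <- result, do: Mint.HTTP.close(conn)
    classify(result)
  end

  # Successful handshake.
  def classify({:ok, _conn}), do: :ok

  # Errors shaped like the one in the logs above:
  # %Mint.TransportError{reason: {:tls_alert, {:unknown_ca, ...}}}
  def classify({:error, %{reason: {:tls_alert, {alert, _detail}}}}), do: {:tls_alert, alert}

  # Anything else (DNS failure, timeout, connection refused, ...).
  def classify({:error, other}), do: {:error, other}
end
```

Calling `TlsCheck.run()` from an IEx session on the affected host should return `:ok` when the handshake succeeds, or `{:tls_alert, :unknown_ca}` when the failure from the logs is reproduced.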

Hermanverschooten commented 2 years ago

Yes it was resolved after updating ca-certificates to the latest version.

axelson commented 2 years ago

@Hermanverschooten I'm surprised that updating the system package for ca-certificates resolved the problem for you. It looks to me that Mint should be using :castore for the certificates:

https://hexdocs.pm/mint/1.4.1/Mint.HTTP.html#connect/4-transport-options

:cacertfile - if :verify is set to :verify_peer (the default) and no CA trust store is specified using the :cacertfile or :cacerts option, Mint will attempt to use the trust store from the CAStore package or raise an exception if this package is not available. Due to caching the :cacertfile option is more efficient than :cacerts.

(SiteEncrypt is setting :verify to :verify_peer and not setting a specific :cacertfile).
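
One way to test which trust store is actually involved is to pass `:cacertfile` explicitly and compare the results. The sketch below is only an illustration: the module `TrustStore` is hypothetical, and the OS bundle path is an assumption that holds for Debian/Ubuntu hosts with the ca-certificates package installed.

```elixir
defmodule TrustStore do
  # Hypothetical helper for comparing trust stores; not part of site_encrypt.
  # Assumed Debian/Ubuntu location of the bundle from the ca-certificates package.
  @os_bundle "/etc/ssl/certs/ca-certificates.crt"

  # Trust store shipped with the CAStore hex package.
  def transport_opts(:castore), do: [verify: :verify_peer, cacertfile: CAStore.file_path()]

  # Trust store maintained by the operating system.
  def transport_opts(:os), do: transport_opts({:file, @os_bundle})

  # Any other PEM bundle on disk.
  def transport_opts({:file, path}), do: [verify: :verify_peer, cacertfile: path]

  # Connects using the chosen trust store; returns Mint's result unchanged.
  def connect(host, store) do
    Mint.HTTP.connect(:https, host, 443, transport_opts: transport_opts(store))
  end
end
```

If `TrustStore.connect("acme-v02.api.letsencrypt.org", :castore)` and the same call with `:os` give different results, that would point at one of the two stores being out of date. If both succeed, the trust store is probably not the cause.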

I had worked around this temporarily by modifying SiteEncrypt to use HTTPoison/Hackney. But after switching back to SiteEncrypt 0.4.2 today, both a dry run and a force_certify have worked.

I'm not sure what conclusions to draw from that.

@axelson You could try making a request from the OS, e.g. with curl https://acme-v02.api.letsencrypt.org/directory, to check if the endpoint is reachable. If that works, you could try to do the same from Elixir using pure mint. I'd expect one of those two should fail.

Both of those are succeeding today (this is the Mint command I ran: {:ok, conn} = Mint.HTTP.connect(:https, "acme-v02.api.letsencrypt.org", 443, [mode: :passive, transport_opts: [verify: :verify_peer]]))

For now I'll just monitor my site (although I suppose it won't renew the certificate for a while)

aptinio commented 2 years ago

I switched to using certbot and found out that the challenge failed verifying mydomain.com even though I was trying to get certs for mysubdomain.mydomain.com. After making mydomain.com reachable (for now, by adding A and AAAA records for it pointing to my server), it worked for both certbot and native.
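
Since two of the reports in this thread came down to name resolution, a quick pre-check of every name involved may help before forcing a certification. This is a minimal sketch using only Erlang's `:inet` module. The module `DomainCheck` is hypothetical, and it only checks that a name resolves from the local host, not that the CA can reach it.

```elixir
defmodule DomainCheck do
  # Hypothetical helper, not part of site_encrypt.

  # Returns the domains that do not resolve. The resolver is injectable
  # so the function can be exercised without real DNS lookups.
  def unresolved(domains, resolver \\ &resolve/1) do
    Enum.reject(domains, fn domain -> match?({:ok, _}, resolver.(domain)) end)
  end

  # Tries an IPv4 (A) lookup first and falls back to IPv6 (AAAA).
  def resolve(domain) do
    host = String.to_charlist(domain)

    with {:error, _} <- :inet.getaddr(host, :inet) do
      :inet.getaddr(host, :inet6)
    end
  end
end
```

For example, `DomainCheck.unresolved(["mydomain.com", "mysubdomain.mydomain.com"])` returns the names that have no A or AAAA record visible from the host. An empty list means all of them resolved.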