Closed hacknisty closed 10 months ago
@vdnclodio Would it be possible to share more about your setup so I can attempt to reproduce the issue?
Certificate manifest?
@zachomedia Configuration (I pasted only the uncommented options; every other option is left at its default value):
#################################
# default-soa-name name to insert in the SOA record if none set in the backend
#
# default-soa-name=a.misconfigured.powerdns.server
default-soa-name=letsencrypt.mydomain
#################################
# include-dir Include *.conf files from this directory
#
# include-dir=
include-dir=/etc/powerdns/pdns.d
#################################
# launch Which backends to launch and order to query them in
#
# launch=
launch=
#################################
# local-address Local IP addresses to which we bind
#
# local-address=0.0.0.0
local-address=w.x.y.z
#################################
# webserver Start a webserver for monitoring (api=yes also enables the HTTP listener)
#
# webserver=no
webserver=yes
api=yes
api-key=myapikey
#################################
# webserver-address IP Address of webserver/API to listen on
#
# webserver-address=127.0.0.1
webserver-address=w.x.y.z
#################################
# webserver-allow-from Webserver/API access is only allowed from these subnets
#
# webserver-allow-from=127.0.0.1,::1
webserver-allow-from=0.0.0.0/0
#################################
# webserver-password Password required for accessing the webserver
#
webserver-password=changeme
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: k8s-example
  namespace: default
  annotations:
    ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: letsencrypt-staging
spec:
  rules:
  - host: k8s-example.mydomain
    http:
      paths:
      - path: /apple
        pathType: Prefix
        backend:
          service:
            name: apple-service
            port:
              number: 5678
  tls:
  - hosts:
    - k8s-example.mydomain
    secretName: k8s-example-cert
I0120 16:28:18.629577 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0120 16:28:18.629542 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0120 16:28:18.629605 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0120 16:28:18.629786 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0120 16:28:18.629788 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0120 16:28:18.630121 1 secure_serving.go:266] Serving securely on [::]:443
I0120 16:28:18.630223 1 dynamic_serving_content.go:129] "Starting controller" name="serving-cert::/tls/tls.crt::/tls/tls.key"
I0120 16:28:18.629789 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0120 16:28:18.630392 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0120 16:28:18.630893 1 apf_controller.go:299] Starting API Priority and Fairness config controller
I0120 16:28:18.730130 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0120 16:28:18.730175 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0120 16:28:18.730589 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0120 16:28:18.731532 1 apf_controller.go:304] Running API Priority and Fairness config worker
Also:
curl -s -H 'X-API-Key: myapikey' http://w.x.y.z:8081/api | jq .
[
  {
    "url": "/api/v1",
    "version": 1
  }
]
curl -s -H 'X-API-Key: myapikey' http://w.x.y.z:8081/api/v1
Not Found
curl -s -H 'X-API-Key: myapikey' http://w.x.y.z:8081/api/v1/
Not Found
curl -s -H 'X-API-Key: myapikey' http://w.x.y.z:8081/api/v1/servers | jq .
[
  {
    "config_url": "/api/v1/servers/localhost/config{/config_setting}",
    "daemon_type": "authoritative",
    "id": "localhost",
    "type": "Server",
    "url": "/api/v1/servers/localhost",
    "version": "4.1.6",
    "zones_url": "/api/v1/servers/localhost/zones{/zone}"
  }
]
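The responses above suggest that /api only advertises the version root, and that on this server the root itself answers 404 while /servers underneath it works. As a minimal sketch (the helper name servers_url is hypothetical and is not part of PowerDNS or the webhook), a client could derive a usable URL from the discovery response like this:

```python
import json
from urllib.parse import urljoin


def servers_url(base_url: str, discovery_body: str, version: int = 1) -> str:
    """Build the /servers URL for an API version listed in the /api response.

    Hypothetical helper for illustration only.
    """
    for entry in json.loads(discovery_body):
        if entry.get("version") == version:
            # The version root may answer 404 (as seen above), so append
            # /servers, which is an endpoint that does respond.
            return urljoin(base_url, entry["url"].rstrip("/") + "/servers")
    raise LookupError(f"API version {version} is not advertised")


# Discovery body copied from the curl output above.
discovery = '[{"url": "/api/v1", "version": 1}]'
print(servers_url("http://w.x.y.z:8081", discovery))
```

With the discovery body shown in the thread, this yields the same /api/v1/servers URL that the last curl command queries successfully.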
@zachomedia
k8s-02.my.me.16658 > letsencrypt.8081: Flags [P.], cksum 0x7f3d (correct), seq 0:185, ack 1, win 507, options [nop,nop,TS val 3975310894 ecr 1822938651], length 185
E...].@.?..._..R_...A.......Cn.d.....=.....
..n.l...GET /api/v1/servers/localhost/zones HTTP/1.1
Host: w.x.y.z:8081
User-Agent: Go-http-client/1.1
X-Api-Key: myapikey
Accept-Encoding: gzip
10:53:34.698422 IP (tos 0x0, ttl 63, id 31692, offset 0, flags [DF], proto TCP (6), length 214)
...
k8s-02.my.me.44755 > letsencrypt.8081: Flags [P.], cksum 0xac1d (correct), seq 0:162, ack 1, win 507, options [nop,nop,TS val 3974830689 ecr 1822458464], length 162
E...{.@.?..._..R_..........F..l............
...al..`GET /api/v1/ HTTP/1.1
Host: w.x.y.z:8081
User-Agent: Go-http-client/1.1
X-Api-Key: myapikey
Accept-Encoding: gzip
Any clues on this? (My guess is that no zone matching my request was found, or something like that.)
@zachomedia OK, a bit more information about my setup: I use a delegated zone for the ACME challenge:
_acme-challenge IN NS letsencrypt.mydomain.com.
This setup already works for hundreds of domains using certbot; I just want to enable the same feature in my k8s cluster.
It is the same as this:
https://cert-manager.io/docs/configuration/acme/dns01/#delegated-domains-for-dns01
So if I run the test suite with mydomain.com, I end up with the 404 error on /api/v1/, but if I run the test suite with _acme-challenge.mydomain.com I get this:
--- FAIL: TestRunsSuite (49.39s)
--- FAIL: TestRunsSuite/Conformance (43.11s)
--- FAIL: TestRunsSuite/Conformance/Basic (14.03s)
--- FAIL: TestRunsSuite/Conformance/Basic/PresentRecord (14.03s)
util.go:59: skipping file "testdata/pdns/README.md" with unrecognised extension
util.go:68: created fixture "basic-present-record"
suite.go:37: Calling Present with ChallengeRequest: &v1alpha1.ChallengeRequest{UID:"", Action:"", Type:"", DNSName:"example.com", Key:"123d==", ResourceNamespace:"basic-present-record", ResolvedFQDN:"cert-manager-dns01-tests._acme-challenge.mydomain.com.", ResolvedZone:"_acme-challenge.mydomain.com.", AllowAmbientCredentials:false, Config:(*v1.JSON)(0xc00000e048)}
suite.go:49: error waiting for DNS record propagation: Could not determine authoritative nameservers for "cert-manager-dns01-tests._acme-challenge.mydomain.com."
--- FAIL: TestRunsSuite/Conformance/Extended (15.39s)
--- FAIL: TestRunsSuite/Conformance/Extended/DeletingOneRecordRetainsOthers (15.39s)
util.go:59: skipping file "testdata/pdns/README.md" with unrecognised extension
util.go:68: created fixture "extended-supports-multiple-same-domain"
suite.go:103: error waiting for DNS record propagation: Could not determine authoritative nameservers for "cert-manager-dns01-tests._acme-challenge.mydomain.com."
FAIL
FAIL github.com/zachomedia/cert-manager-webhook-pdns 49.421s
FAIL
Is this scenario supported by this webhook?
Ah! That might be why, I have not tested this with the delegated domains feature. I'll test it locally and confirm I can reproduce the issue and then figure out how to fix it.
Thanks!
@vdnclodio Can you try changing your issuer as such:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # ...
    solvers:
    - dns01:
        cnameStrategy: Follow
        webhook:
          groupName: acme.zacharyseguin.ca
          # ...
The cnameStrategy seems to be what's needed according to the docs in order to get cert-manager to update the correct zone, and in my limited testing so far it seems to fix the issue.
@zachomedia Unfortunately, no luck: I am still getting a 404 Not Found on /api/v1/.
But I can see the request reaching my delegated DNS server, so the CNAME-following part is working. Using tcpdump, I can see this:
[
  {
    "account": "",
    "dnssec": false,
    "id": "=5Facme-challenge.mydomain.com.",
    "kind": "Native",
    "last_check": 0,
    "masters": [],
    "name": "_acme-challenge.mydomain.com.",
    "notified_serial": 0,
    "serial": 11,
    "url": "/api/v1/servers/localhost/zones/=5Facme-challenge.mydomain.com."
  }, ...
]
The answer contains all my zones, including _acme-challenge.mydomain.com as shown, but I guess this is where it fails. It looks like it checks for a zone mydomain.com and not _acme-challenge.mydomain.com. This part does not appear in the log, so I don't know what is searched for there. It seems to happen somewhere here: https://github.com/zachomedia/cert-manager-webhook-pdns/blob/71931033af2516713ebeb7123f4b2e3d0536ed88/provider/client.go#L80
And I suppose the UnFqdn function for k8s-example.mydomain.com returns mydomain.com and not _acme-challenge.mydomain.com (this is just a guess; maybe you have a way for me to use a debug webhook?).
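To make the guess above easier to reason about, here is a minimal Python sketch of a zone lookup. It is not the webhook's actual Go code: find_zone and zone_id are hypothetical helpers, the longest-suffix matching rule is an assumption about what a lookup should do, and the =XX encoding is inferred only from the "=5F" in the sample "id" field above.

```python
from typing import Iterable, Optional


def zone_id(zone_name: str) -> str:
    """Encode a zone name like the sample "id" fields (assumed rule:
    keep ASCII letters, digits, '.' and '-', hex-escape everything else)."""
    out = []
    for ch in zone_name:
        if ch.isascii() and (ch.isalnum() or ch in ".-"):
            out.append(ch)
        else:
            out.append("".join(f"={b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


def find_zone(fqdn: str, zone_names: Iterable[str]) -> Optional[str]:
    """Return the most specific known zone containing fqdn, or None."""
    labels = fqdn.lower().rstrip(".").split(".")
    known = {z.lower().rstrip(".") for z in zone_names}
    # Try the full name first, then drop one leading label at a time,
    # so a delegated sub-zone wins over its parent zone.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in known:
            return candidate + "."
    return None


zones = ["mydomain.com.", "_acme-challenge.mydomain.com."]
fqdn = "cert-manager-dns01-tests._acme-challenge.mydomain.com."
zone = find_zone(fqdn, zones)
print(zone, zone_id(zone))
```

Under these assumptions the delegated zone is selected whenever it appears in the zone list. A lookup that starts from mydomain.com instead would never reach it, and a lookup that matches nothing returns None.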
Can you confirm what your DNS configuration is? I'm thinking that it's different from what cert-manager is expecting.
If I read https://cert-manager.io/docs/configuration/acme/dns01/#delegated-domains-for-dns01, I understand it as this:
_acme-challenge.example.com should be a CNAME record to some other domain, say _acme-challenge.challenges.example.com. Then your pdns server should be authoritative for challenges.example.com, in which it can then insert the correct record. (So _acme-challenge IN CNAME _acme-challenge.challenges.example.com in your example.com zone.)
Following this logic, I would expect that you have an entry in your example.com zone: challenges IN NS letsencrypt.example.com. Then that server has a zone defined for challenges.example.com.
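The CNAME layout described above can be modelled with a small Python sketch. It uses an in-memory dictionary in place of real DNS queries, and follow_cname is a hypothetical helper, not cert-manager's implementation of cnameStrategy: Follow.

```python
from typing import Dict


def follow_cname(name: str, cnames: Dict[str, str], max_hops: int = 8) -> str:
    """Follow CNAME records starting at name and return the final target.

    cnames maps an owner name to its CNAME target (toy stand-in for DNS).
    """
    current = name.lower()
    # max_hops + 1 iterations: up to max_hops follows, plus one final check.
    for _ in range(max_hops + 1):
        target = cnames.get(current)
        if target is None:
            return current  # no further CNAME: the TXT record belongs here
        current = target.lower()
    raise RuntimeError(f"CNAME chain from {name} exceeds {max_hops} hops")


# Record taken from the example in the comment above.
cnames = {
    "_acme-challenge.example.com.": "_acme-challenge.challenges.example.com.",
}
print(follow_cname("_acme-challenge.example.com.", cnames))
```

In this model the challenge TXT record ends up under challenges.example.com, which is the zone the pdns server is expected to be authoritative for.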
It can also be an NS record (this already works with certbot, and both follow ACME DNS-01).
So I want a cert for k8s-example.mydomain.com.
My main zone, mydomain.com, contains a record like this:
_acme-challenge IN NS letsencrypt.mydomain.com.
I have a PowerDNS server at letsencrypt.mydomain.com hosting the _acme-challenge.mydomain.com zone. In this zone (the subdelegated zone) I have the TXT record to be updated by this webhook, which would validate the DNS-01 request.
Unfortunately it appears that this is not supported by cert-manager: https://github.com/jetstack/cert-manager/issues/3453#issuecomment-725548578
And the issue was closed with:
Since cnameStrategy: Follow seems like a good way of working around the lack of "NS follow" support, I will close this issue. Feel free to re-open if you would like to expand.
Edit, to add: this webhook (and, I'm sure, all others) uses the same functions to resolve the zone.
OK, I will reopen this issue. Thanks a lot. As a temporary fix, I will make a custom build of your project, replacing the searched zone with the correct one, and I suppose it will just work (as I said, I already have this setup working, and I can't switch all of those domains to CNAME just like that). Then I will wait for cert-manager to fix the issue.
Digging through the code I have found a few bugs in my implementation (likely unrelated to the issue), including the source of the 404 (which happens when it can't resolve a zone). It should return an error there.
I'll see if there is something I can implement as a setting to support this.
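As an illustration of the "return an error" idea above, here is a minimal Python sketch of a guard that fails early instead of building a request against the bare /api/v1/ path. ZoneNotFoundError and zone_url are hypothetical names and are not the webhook's API. The URL layout is copied from the "url" field in the zone listing earlier in the thread.

```python
from typing import Optional


class ZoneNotFoundError(Exception):
    """Raised when no hosted zone matches the challenge name."""


def zone_url(base_url: str, server_id: str, zone_id: Optional[str], fqdn: str) -> str:
    """Build the zone URL, refusing to continue when no zone was resolved."""
    if not zone_id:
        # Without this check an empty zone could leave a request aimed at
        # the API root, which answers 404 Not Found.
        raise ZoneNotFoundError(f"no zone found for {fqdn}")
    return f"{base_url.rstrip('/')}/api/v1/servers/{server_id}/zones/{zone_id}"


print(zone_url("http://w.x.y.z:8081", "localhost",
               "=5Facme-challenge.mydomain.com.",
               "cert-manager-dns01-tests._acme-challenge.mydomain.com."))
```

Failing this way puts the name that could not be matched in the error message, which is more actionable than a 404 from the API root.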
Sounds great, I'm available for testing if needed.
@zachomedia So I made it work on the webhook side (my TXT record is properly updated now).
But it still fails at the propagation check. However, if I run dig _acme-challenge.mydomain.com txt, I get the right challenge in the TXT record.
Do you, by any chance, know what is responsible for the propagation check?
Maybe I should force a DNS server for the check?
Well, never mind, I found it. cert-manager is responsible for it. And even though I succeed in updating the challenge, the DNS propagation check does not follow NS records at the cert-manager level.
@vdnclodio I wonder if setting --dns01-recursive-nameservers-only on your cert-manager deployment (from https://cert-manager.io/docs/configuration/acme/dns01/#setting-nameservers-for-dns01-self-check) would help with that, since rather than trying to locate the authoritative server it will look it up via the recursive server. It might cause a bit of a delay, but it should get you past that part, since I think it will skip that logic in cert-manager.
Unfortunately, it fails while querying the wrong SOA record. I guess I'm stuck until the issue is fixed upstream.
As nothing has been heard on this in a while, and since this is an upstream issue with cert-manager, I am going to close this ticket. Please don't hesitate to reach out if you have any further issues.
Hi,
I'm trying to set up cert-manager-webhook-pdns using the Helm chart.
The issuer has been defined this way:
The pdns-api-key has been properly set up with the corresponding PDNS API key.
I keep getting this error message in the cert-manager container, and no certs are delivered:
By the way, http://pdns:8081/api/v1/ always returns 404 Not Found in my setup, but querying a server or a zone works properly.
How can I properly set up this plugin?