freeipa / freeipa-healthcheck

Check the health of a freeIPA installation
GNU General Public License v3.0

Errors with IPADogtagCertsMatchCheck #253

Closed andreasdijkman closed 2 years ago

andreasdijkman commented 2 years ago

Running: Oracle Linux 8.5 Package-versions:

ipa-server-4.9.6-10.0.1.module+el8.5.0+20451+6c55862e.x86_64
ipa-healthcheck-0.7-6.module+el8.5.0+20379+1b4496cf.noarch
ipa-healthcheck-core-0.7-6.module+el8.5.0+20379+1b4496cf.noarch

In #193 there have been some remarks about string replacements. I'm getting the following messages on both of my IPA nodes:

  {
    "source": "ipahealthcheck.ipa.certs",
    "check": "IPADogtagCertsMatchCheck",
    "result": "ERROR",
    "uuid": "f5f78652-4ec5-451e-b7b9-ae22123c52ef",
    "when": "20220301213649Z",
    "duration": "0.786170",
    "kw": {
      "key": "ocspSigningCert cert-pki-ca",
      "nickname": "ocspSigningCert cert-pki-ca",
      "dbdir": "/etc/pki/pki-tomcat/alias",
      "msg": "{nickname} certificate in NSS DB {dbdir} does not match entry in LDAP"
    }
  },
  {
    "source": "ipahealthcheck.ipa.certs",
    "check": "IPADogtagCertsMatchCheck",
    "result": "ERROR",
    "uuid": "c2ec4ec5-4a08-4c0d-8111-3976e24befe6",
    "when": "20220301213649Z",
    "duration": "0.860105",
    "kw": {
      "key": "subsystemCert cert-pki-ca",
      "nickname": "subsystemCert cert-pki-ca",
      "dbdir": "/etc/pki/pki-tomcat/alias",
      "msg": "{nickname} certificate in NSS DB {dbdir} does not match entry in LDAP"
    }
  },
  {
    "source": "ipahealthcheck.ipa.certs",
    "check": "IPADogtagCertsMatchCheck",
    "result": "ERROR",
    "uuid": "34c10ce3-e8a2-4b4f-b68d-c160511e335c",
    "when": "20220301213649Z",
    "duration": "0.932143",
    "kw": {
      "key": "auditSigningCert cert-pki-ca",
      "nickname": "auditSigningCert cert-pki-ca",
      "dbdir": "/etc/pki/pki-tomcat/alias",
      "msg": "{nickname} certificate in NSS DB {dbdir} does not match entry in LDAP"
    }
  },
  {
    "source": "ipahealthcheck.ipa.certs",
    "check": "IPADogtagCertsMatchCheck",
    "result": "ERROR",
    "uuid": "666e0461-95e9-409a-afde-186ff0e7f0be",
    "when": "20220301213649Z",
    "duration": "1.002361",
    "kw": {
      "key": "Server-Cert cert-pki-ca",
      "nickname": "Server-Cert cert-pki-ca",
      "dbdir": "/etc/pki/pki-tomcat/alias",
      "msg": "{nickname} certificate in NSS DB {dbdir} does not match entry in LDAP"
    }
  }

I can't find what to do about these messages or what's actually wrong. Any pointers about where to look?

The command getcert list isn't showing anything stuck or not MONITORING. This message appears after upgrading IPA from 4.9.2 to 4.9.6 and freeipa-healthcheck from 0.7-3 to 0.7-6.

Also had to use the workaround mentioned here: https://pagure.io/freeipa/issue/9041

rcritten commented 2 years ago

There are no string replacements. The "msg" field was abstracted to make translations possible.
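As a rough illustration of how the templated "msg" relates to the other "kw" fields, here is a minimal Python sketch. The helper name render_msg is hypothetical and this is not healthcheck's own code; it only applies str.format to one of the results reported above.

```python
import json

# One result copied from the report above (trimmed to the relevant fields).
result = json.loads("""
{
  "check": "IPADogtagCertsMatchCheck",
  "result": "ERROR",
  "kw": {
    "key": "ocspSigningCert cert-pki-ca",
    "nickname": "ocspSigningCert cert-pki-ca",
    "dbdir": "/etc/pki/pki-tomcat/alias",
    "msg": "{nickname} certificate in NSS DB {dbdir} does not match entry in LDAP"
  }
}
""")


def render_msg(kw):
    """Fill the {placeholders} in kw['msg'] from the other kw fields.

    Hypothetical helper, for illustration only.
    """
    fields = {k: v for k, v in kw.items() if k != "msg"}
    return kw["msg"].format(**fields)


print(render_msg(result["kw"]))
```

Keeping the template and its values separate is what lets the message text be translated without touching the data.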

It means that the LDAP view of the certificates is different than what is in the CA NSS database. You should look on all your IPA servers to determine which one is correct (most likely the most recently issued certificate).

It could point, for example, to a renewal problem where one or more servers did not get an updated renewed certificate.

andreasdijkman commented 2 years ago

I've looked in IPA under Authentication -> Certificates, and the names (Subject) match the output of getcert list on both nodes. Could the organisation name in the check be a problem? Ours has a . in the name.

Checking the subjects of the certificates in ou=certificateRepository,ou=ca,o=ipaca in the LDAP-server and the info in getcert list seem to match.

It would also be nice to change the message on line 725 from '{dbdir} does not match entry in LDAP')) to '{dbdir} does not match entry in LDAP by subject')), because it is currently identical to the message on line 704, which is confusing.

flo-renaud commented 2 years ago

Hi,

the check is validating that the certificate is identical, not only its subject. For instance for the ocsp subsystem cert, in order to get the ASCII format of the cert stored in the NSS database:

certutil -L -d /etc/pki/pki-tomcat/alias -n 'ocspSigningCert cert-pki-ca' -a

-----BEGIN CERTIFICATE-----

MIID4TCCAkmgAwIBAgIBAjANBgkqhkiG9w0BAQsFADAzMREwDwYDVQQKDAhJUEEuVEVTVDEeMBwGA1UEAwwVQ2VydGlmaWNhdGUgQXV0aG9yaXR5MB4XDTIyMDEyMTE1NTUxMloXDTI0MDExMTE1NTUxMlowLDERMA8GA1UECgwISVBBLlRFU1QxFzAVBgNVBAMMDk9DU1AgU3Vic3lzdGVtMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzbAfq3nDYgJ04gVZ3P7QbPLC3dLsD4EfcSCyS9I045y1XOE+7pfPZq6+RX8Mj5bdDCjJgYo6MimfkrVdPFZ3QrppPyt9wmKFXX812BYuNPevNfCcXsiuBpqH3PJE3c8YayanMaVTAtAL7SAtN/TMsBOdaRGbZMiem6xnA04ccVXuU5ipIgOQkJu1oBcRQfHhRWmzQqd73Os552kqNZEqAxUsGyyqxiFzu4hW5uQdynko9m8+9ZJphj3dh6saLLQ1qx7u4YxMRquw8kNp7EY9nB+MFUMEU36ptpXxH2WwBZ3CW2lyZ5Tgv94WxUGoB7B72IoKJlvQE8uvDv22HPNssQIDAQABo4GGMIGDMB8GA1UdIwQYMBaAFEupcV7MXSeEeJEAoNSgequZYS09MDoGCCsGAQUFBwEBBC4wLDAqBggrBgEFBQcwAYYeaHR0cDovL2lwYS1jYS5pcGEudGVzdC9jYS9vY3NwMBMGA1UdJQQMMAoGCCsGAQUFBwMJMA8GCSsGAQUFBzABBQQCBQAwDQYJKoZIhvcNAQELBQADggGBADWC/TDB2ZWQ0W7DYI/x3fZKkZCnIrq4KMF6PEbWfZxyUfWRE3EIkgNJaxPmGpZJgFebWQqnmcl8v7Ce5ZF9om0nhdzdcR+hwqWEmTtgYKpTuNIgfahNjqMCm/+LocnJ1cZCdirSPocoOxBHaz96Z6GgWxAj8K7obipSV+LKfhhPlnLLm44Zvv/0ul3tSIX/tkd7pcXctOL717+yHt0R+qRlJM6XAM28tzpUGbqzMuWRuKhg4GAAVRPgitDqLHfrbkwLpCY6u4zpV4cugfkYlLKKzPpzIle00Z6hK3pIe2pSBSDN5dh08rg+JNc76fNaTXWzfY86cczbhG+tM8574db9zFb6W81O/d+MUbkCUjDCRF9h3I9oTuEdPyveWwLOKSw2ahNDNJosKdsrxWYxXILMNhDp3AdwzMIx0OEi7puUZZqFTx/u07ME9d2RdDiWVm+At9xSe7JeuGocs1k4z470ITliaf3MKOQuUK3rl7zLr+FCvehg8SldyIqJp4AsmA== -----END CERTIFICATE-----

That output is compared with the userCertificate stored in ldap:

ldapsearch -o ldif-wrap=no -LLL -D cn=directory\ manager -w $PASSWORD -b ou=certificaterepository,ou=ca,o=ipaca "(subjectName=CN=OCSP Subsystem,O=IPA.TEST)" dn usercertificate subjectname

dn: cn=2,ou=certificateRepository,ou=ca,o=ipaca
userCertificate;binary:: MIID4TCCAkmgAwIBAgIBAjANBgkqhkiG9w0BAQsFADAzMREwDwYDVQQKDAhJUEEuVEVTVDEeMBwGA1UEAwwVQ2VydGlmaWNhdGUgQXV0aG9yaXR5MB4XDTIyMDEyMTE1NTUxMloXDTI0MDExMTE1NTUxMlowLDERMA8GA1UECgwISVBBLlRFU1QxFzAVBgNVBAMMDk9DU1AgU3Vic3lzdGVtMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzbAfq3nDYgJ04gVZ3P7QbPLC3dLsD4EfcSCyS9I045y1XOE+7pfPZq6+RX8Mj5bdDCjJgYo6MimfkrVdPFZ3QrppPyt9wmKFXX812BYuNPevNfCcXsiuBpqH3PJE3c8YayanMaVTAtAL7SAtN/TMsBOdaRGbZMiem6xnA04ccVXuU5ipIgOQkJu1oBcRQfHhRWmzQqd73Os552kqNZEqAxUsGyyqxiFzu4hW5uQdynko9m8+9ZJphj3dh6saLLQ1qx7u4YxMRquw8kNp7EY9nB+MFUMEU36ptpXxH2WwBZ3CW2lyZ5Tgv94WxUGoB7B72IoKJlvQE8uvDv22HPNssQIDAQABo4GGMIGDMB8GA1UdIwQYMBaAFEupcV7MXSeEeJEAoNSgequZYS09MDoGCCsGAQUFBwEBBC4wLDAqBggrBgEFBQcwAYYeaHR0cDovL2lwYS1jYS5pcGEudGVzdC9jYS9vY3NwMBMGA1UdJQQMMAoGCCsGAQUFBwMJMA8GCSsGAQUFBzABBQQCBQAwDQYJKoZIhvcNAQELBQADggGBADWC/TDB2ZWQ0W7DYI/x3fZKkZCnIrq4KMF6PEbWfZxyUfWRE3EIkgNJaxPmGpZJgFebWQqnmcl8v7Ce5ZF9om0nhdzdcR+hwqWEmTtgYKpTuNIgfahNjqMCm/+LocnJ1cZCdirSPocoOxBHaz96Z6GgWxAj8K7obipSV+LKfhhPlnLLm44Zvv/0ul3tSIX/tkd7pcXctOL717+yHt0R+qRlJM6XAM28tzpUGbqzMuWRuKhg4GAAVRPgitDqLHfrbkwLpCY6u4zpV4cugfkYlLKKzPpzIle00Z6hK3pIe2pSBSDN5dh08rg+JNc76fNaTXWzfY86cczbhG+tM8574db9zFb6W81O/d+MUbkCUjDCRF9h3I9oTuEdPyveWwLOKSw2ahNDNJosKdsrxWYxXILMNhDp3AdwzMIx0OEi7puUZZqFTx/u07ME9d2RdDiWVm+At9xSe7JeuGocs1k4z470ITliaf3MKOQuUK3rl7zLr+FCvehg8SldyIqJp4AsmA==
subjectname: CN=OCSP Subsystem,O=IPA.TEST

The userCertificate;binary attribute of the LDAP entry must be the same as the ASCII format minus the header/footer.
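The manual comparison described above can be sketched in Python using only the standard library. This is a minimal sketch, not what ipa-healthcheck runs: the function names are made up for illustration, and it assumes you have saved the certutil -a output and the base64 value printed after "userCertificate;binary::" as text.

```python
import base64


def pem_to_der(pem_text):
    """Drop the BEGIN/END lines and decode the base64 body to DER bytes."""
    lines = [line.strip() for line in pem_text.splitlines()]
    body = "".join(l for l in lines if l and not l.startswith("-----"))
    return base64.b64decode(body)


def ldap_value_to_der(b64_value):
    """Decode the base64 text ldapsearch prints for userCertificate;binary."""
    # Remove any whitespace in case the value was wrapped when copied.
    return base64.b64decode("".join(b64_value.split()))


def certs_match(pem_text, ldap_b64_value):
    """True when the NSS copy and the LDAP copy are byte-for-byte identical."""
    return pem_to_der(pem_text) == ldap_value_to_der(ldap_b64_value)
```

Comparing the decoded bytes rather than the text sidesteps differences in line wrapping between the two tools.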

Hope this clarifies, flo


andreasdijkman commented 2 years ago

Yes, that helps a lot! I now know where to look and hopefully can fix the issue.

andreasdijkman commented 2 years ago

I checked them and they all match: the certificate in LDAP and the certificate in the NSS DB. The ldapsearch returns the correct certificate, minus the headers and the line wrapping at 64 characters.

The only one that doesn't match (exactly) is Server-Cert cert-pki-ca: I have 8 of those. On both nodes, ldapsearch returns 8 entries for this query:

ldapsearch -o ldif-wrap=no -LLL -D cn=directory\ manager -W -b ou=certificaterepository,ou=ca,o=ipaca "(subjectName=CN=<hostname>,O=<orgname>)" dn usercertificate subjectname

They are all valid; the first is the correct one and matches. The other 7 have the same subject but are all newer and also valid.

Any pointers?

rcritten commented 2 years ago

That is effectively what the code is doing too except it isn't doing a specific subject search. The current code isn't particularly efficient as it looks like it pulls all certificates in order to do the comparison.

The code is more or less:

entries = <all certificates>
for entry in entries:
    if 'userCertificate' in entry:
        if cert not in entry["userCertificate"] and subject == entry["subjectName"][0]:
            yield error

Where cert is the base64 blob from the NSS database and userCertificate is a multi-valued LDAP attribute.

rcritten commented 2 years ago

I wouldn't worry about Server-Cert too much. That is a similarly named but different cert per-IPA server.

andreasdijkman commented 2 years ago

So the code needs some fixes I assume? I checked our certificates manually, but the ipa-healthcheck is somehow not comparing the correct certificates with each other?

Can I run some debug-code on our environment, to assist in fixing the check?

flo-renaud commented 2 years ago

Maybe you are hitting this issue: https://bugzilla.redhat.com/show_bug.cgi?id=2066308

Was IdM deployed with a custom subject base? Easy to check, you just need to do:

grep subject_base /var/lib/ipa/sysupgrade/sysupgrade.state

subject_base = O=IPA.TEST

If that's the case (subject base is not O=), it explains the failing checks. ipa-healthcheck is expecting a hardcoded subject name, like CN=OCSP Subsystem,O= instead of CN=OCSP Subsystem,

flo


andreasdijkman commented 2 years ago

No, not hitting that bug. We only have 1 O=<org-name> in that line. The Subject-line matches the format CN=OCSP Subsystem,<subject-base>, which we have 2 of now. And we have multiple host-certificates.

Can it be that the entire check fails if only one certificate fails to match, because there are multiple of those? As said before, the certificate itself is also checked against what is stored in the LDAP-server and maybe it compares the wrong certificates, that are possibly superseded by newer certs?

rcritten commented 2 years ago

Multiple certificates for the same cert should not be a problem. Do you have a lot of issued certificates? The LDAP query to retrieve the certificates is not particularly efficient and I wonder if it is hitting a search limit. Also, can you see if there are multiple OCSP certificates in the CA NSS database? certutil -L -d /etc/pki/pki-tomcat/alias

andreasdijkman commented 2 years ago

[root@ipa01 ~]# certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu

And we have 8 host-certificates per IPA-node.

rcritten commented 2 years ago

Ok thanks. It's perfectly legal to have multiple certificates for the same nickname; NSS will pick the "best" one. That might have confused healthcheck, but it isn't the case here. Let's see how many entries are returned from the less-than-ideal search filter currently being used. Run:

ipa-healthcheck --source ipahealthcheck.ipa.certs --check IPADogtagCertsMatchCheck

Wait 30 seconds (log buffer) and check /var/log/dirsrv/slapd-REALM/access. You're looking for something like:

[25/Mar/2022:10:31:12.522134281 -0400] conn=228 op=8 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=2 filter="(objectClass=*)" attrs=ALL
[25/Mar/2022:10:31:12.522634164 -0400] conn=228 op=8 RESULT err=0 tag=101 nentries=12 wtime=0.000093610 optime=0.000510744 etime=0.000601536 notes=U details="Partially Unindexed Filter"
[25/Mar/2022:10:31:12.937836452 -0400] conn=228 op=9 UNBIND

The value of nentries is what I'm looking for.
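If the access log is busy, a small script can pull out the relevant nentries value instead of reading the log by eye. The following is only a sketch: the function name is made up, and it assumes the log line layout shown in the sample above, pairing each SRCH on the certificate repository base with its RESULT line by conn and op.

```python
import re

SRCH_RE = re.compile(r'conn=(\d+) op=(\d+) SRCH base="([^"]*)"')
RESULT_RE = re.compile(r"conn=(\d+) op=(\d+) RESULT .*\bnentries=(\d+)")
CERT_BASE = "ou=certificaterepository,ou=ca,o=ipaca"


def cert_repo_nentries(lines):
    """Return nentries for each search under the certificate repository."""
    pending = set()  # (conn, op) pairs of SRCH lines on the cert repository
    counts = []
    for line in lines:
        m = SRCH_RE.search(line)
        if m and m.group(3).lower() == CERT_BASE:
            pending.add((m.group(1), m.group(2)))
            continue
        m = RESULT_RE.search(line)
        if m and (m.group(1), m.group(2)) in pending:
            pending.discard((m.group(1), m.group(2)))
            counts.append(int(m.group(3)))
    return counts


# The two relevant lines from the sample log above.
sample = [
    '[25/Mar/2022:10:31:12.522134281 -0400] conn=228 op=8 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=2 filter="(objectClass=*)" attrs=ALL',
    '[25/Mar/2022:10:31:12.522634164 -0400] conn=228 op=8 RESULT err=0 tag=101 nentries=12 wtime=0.000093610 optime=0.000510744 etime=0.000601536 notes=U details="Partially Unindexed Filter"',
]
print(cert_repo_nentries(sample))
```

To run it against a real log, pass an open file object for the access log in place of sample.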

andreasdijkman commented 2 years ago

I'm getting 30 entries.

[25/Mar/2022:15:48:53.918669938 +0100] conn=463929 op=4 SRCH base="ou=certificateRepository,ou=ca,o=ipaca" scope=2 filter="(objectClass=*)" attrs=ALL
[25/Mar/2022:15:48:53.936220494 +0100] conn=463929 op=4 RESULT err=0 tag=101 nentries=30 wtime=0.000201986 optime=0.017558060 etime=0.017752582 notes=U details="Partially Unindexed Filter"

rcritten commented 2 years ago

FWIW the reason you don't see this in 0.7-3 is the check was added in 0.7-5.

At this point I think the fastest way to determine what is going on would be to look at the actual data. If you'd like you can send me the information directly to rcritten at redhat.com

I think I can diagnose this from any of the failed certs; OCSP seems as good as any. I'd need:

certutil -L -d /etc/pki/pki-tomcat/alias -n 'ocspSigningCert cert-pki-ca' -a

and

ldapsearch -LLL -x -D 'cn=directory manager' -W -b ou=certificateRepository,ou=ca,o=ipaca usercertificate subjectname

rcritten commented 2 years ago

I dug through the data you provided and I can find no reason for this to fail other than the hardcoded realm as the subject base.

Can you confirm that the subject base reported below matches the O= in your certificates?

# ipa config-show |grep Subject
  Certificate Subject base: O=EXAMPLE.TEST

Incidentally, it isn't comparing the base64 version of the certificates but a binary representation using python-cryptography, which means our separate subject comparison is superfluous.

andreasdijkman commented 2 years ago

Yes, I can confirm it matches. I compared the output of these 2 commands (no extra spaces at the end):

# ipa config-show |grep Subject
  Certificate Subject base: O=<name>

and

# grep subject_base /var/lib/ipa/sysupgrade/sysupgrade.state
subject_base = O=<name>

And both report the same O=<name>.

Also the subject of the certificate matches:

# certutil -L -d /etc/pki/pki-tomcat/alias -n 'ocspSigningCert cert-pki-ca' -a | openssl x509 -noout -subject
subject=O = <name>, CN = OCSP Subsystem

Could the order of the subject in the certificate itself matter, i.e. the difference between O = <name>, CN = OCSP Subsystem and CN = OCSP Subsystem, O = <name>?

rcritten commented 2 years ago

Does this match the value of realm in /etc/ipa/default.conf?

andreasdijkman commented 2 years ago

No, not at all.

In /etc/ipa/default.conf we have the actual domain-name (like example.com) for realm in uppercase and domain in lowercase, like this:

realm = EXAMPLE.COM
domain = example.com

The value of O= is the actual name of the organization.

rcritten commented 2 years ago

If I'm understanding you correctly, that explains it. We incorrectly hardcoded the expected certificate subject to include the realm name, not taking into consideration that the subject base can be set by the user at install time.
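To make the mismatch concrete, here is a small Python illustration. It is not the healthcheck source: both function names and the "Example Org" subject base are invented for the example, and only the realm value mirrors the one quoted from /etc/ipa/default.conf earlier in the thread.

```python
def expected_subject_from_realm(cn, realm):
    """The assumption described as the bug: O= is always the realm."""
    return f"CN={cn},O={realm}"


def expected_subject_from_base(cn, subject_base):
    """Build the expected subject from the configured subject base instead."""
    return f"CN={cn},{subject_base}"


realm = "EXAMPLE.COM"            # realm, as in /etc/ipa/default.conf
subject_base = "O=Example Org"   # made-up custom subject base
actual = "CN=OCSP Subsystem,O=Example Org"

# The realm-based guess misses the certificate; the subject base finds it.
print(expected_subject_from_realm("OCSP Subsystem", realm) == actual)       # False
print(expected_subject_from_base("OCSP Subsystem", subject_base) == actual)  # True
```

With a default install the subject base is O=<realm>, so both forms agree, which would explain why the problem only shows up with a custom subject base.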

I'm working on a fix for this in BZ https://bugzilla.redhat.com/show_bug.cgi?id=2066308

I expect to submit a PR for it this week. It will include a much more efficient LDAP filter. The current code pulls ALL certificates from the CA which is horribly inefficient and with enough certs could cause other failures.

I'm also considering how to improve troubleshooting of this, either through DEBUG logs in the code or hints in the error report (or both).