Icinga / icingaweb2-module-x509

Keeps track of certificates as they are deployed in a network environment.
https://icinga.com/docs/x509/latest/
GNU General Public License v2.0

Report expired status in CLI but not module interface #232

Open sdaru opened 5 months ago

sdaru commented 5 months ago

Describe the bug

Certificates show as expired in icingacli/Icinga Web but not in the module interface.

To Reproduce

I think it was working fine before I updated the package (I previously had version 1.0.0). Now, after dropping and recreating the database and installing the latest version of the module, the web interface shows correct information for the certificate (Certificate chain is valid. / in XX days), but the CLI and Icinga Web (since it uses the CLI) report an error:

icingacli x509 check host --host 'myhost.domain.org'
CRITICAL - myhost.domain.org has expired since 240 days|'myhost.domain.org''=0s;1209600:;604800:;0;360353

On the web page for that host, everything looks fine.

I have the same issue with a different host (via the CLI).

Expected behavior

Show current status and not trigger warning/error status.

Your Environment

Include as many relevant details about the environment you experienced the problem in

Can someone help me?

Originally posted by @sdaru in https://github.com/Icinga/icingaweb2-module-x509/issues/211#issuecomment-1883337673

yhabteab commented 5 months ago

Hi @sdaru, thanks for reporting!

Can you please provide a screenshot of that chain's detail from Icinga Web? Perhaps the issuer of your myhost.domain.org or one of its intermediate CAs has expired.

sdaru commented 5 months ago

That CA is internal. We renewed it and imported it, and everything is valid; I double-checked just before starting the first scan.

Kind Regards.

yhabteab commented 5 months ago

That CA is internal and we renewed it

If you don't want to expose CA details publicly, you can blur all the sensitive information. I just want to see the expiration dates of all certificates within that chain.

slalomsk8er commented 5 months ago

blur all the sensitive information

Blur isn't very destructive. Just put a block over it.

sdaru commented 5 months ago

Hi all, I'm back.

Here are the screenshots:

Host: [screenshot: certificate1]

Subca01: [screenshots: certificate3, certificate2]

RootCA: [screenshots: certificate4, certificate5]

Waiting for any info or news.

Kind Regards

sdaru commented 5 months ago

Any news, guys?

slalomsk8er commented 2 months ago

I have a similar problem.

'/usr/bin/icingacli' 'x509' 'check' 'host' '--critical' '3d' '--host' 'example01.example.com' '--ip' '10.5.72.218' '--port' '8443' '--warning' '7d'
CRITICAL - mirth-connect: unable to get local issuer certificate; mirth-connect has expired since 19829 days|'mirth-connect'=0s;604800:;259200:;0;0

[two screenshots]

yhabteab commented 2 months ago

Are you guys all using MySQL/MariaDB? Actually, there shouldn't be any discrepancy when using PostgreSQL, but MySQL and MariaDB seem to do some unexpected things. The x509 check command determines the valid from/to dates of the certificates in a given chain using min/max aggregates... https://github.com/Icinga/icingaweb2-module-x509/blob/8425ede0f4892e9c2c7c3ee58116cf474ef84f76/application/clicommands/CheckCommand.php#L92-L105

... which seems to work fine with PostgreSQL in any case, but with MySQL/MariaDB it produces unexpected results if the certificate is not yet part of a chain.

PostgreSQL:

postgres=# SELECT MAX(GREATEST(3185943150, NULL)) AS max;
    max     
------------
 3185943150
(1 row)

postgres=# SELECT MIN(LEAST(3185943150, NULL)) AS min;
    min     
------------
 3185943150
(1 row)

MySQL/MariaDB:

MariaDB [x509]> SELECT MAX(GREATEST(3185943150, NULL)) AS max;
+------+
| max  |
+------+
| NULL |
+------+
1 row in set (0.000 sec)

MariaDB [x509]> SELECT MIN(LEAST(3185943150, NULL)) AS min;
+------+
| min  |
+------+
| NULL |
+------+
1 row in set (0.000 sec)
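
The difference between the two outputs above comes down to how each database treats NULL arguments to GREATEST/LEAST: PostgreSQL ignores them, while MySQL/MariaDB return NULL as soon as one argument is NULL. The Python sketch below only mimics those two behaviours; the function names are made up for illustration and are not part of the module.

```python
from typing import Optional


def greatest_postgres(*values: Optional[int]) -> Optional[int]:
    """PostgreSQL-style GREATEST: NULLs are ignored, NULL only if all arguments are NULL."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def greatest_mysql(*values: Optional[int]) -> Optional[int]:
    """MySQL/MariaDB-style GREATEST: NULL as soon as any argument is NULL."""
    if any(v is None for v in values):
        return None
    return max(values)


def least_postgres(*values: Optional[int]) -> Optional[int]:
    """PostgreSQL-style LEAST, same NULL handling as greatest_postgres."""
    present = [v for v in values if v is not None]
    return min(present) if present else None


def least_mysql(*values: Optional[int]) -> Optional[int]:
    """MySQL/MariaDB-style LEAST, same NULL handling as greatest_mysql."""
    if any(v is None for v in values):
        return None
    return min(values)


valid_to = 3185943150  # timestamp taken from the queries above
print(greatest_postgres(valid_to, None))  # 3185943150
print(greatest_mysql(valid_to, None))     # None
```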

As you can see from the output above, MariaDB returns a null timestamp if the certificate is not part of a chain, which is the case in your scenario. When calculating the from - to timestamp of the certificate from @slalomsk8er's screenshot above, the output is the same as in the UI.

root@506429f94d6f:/# date -d '2070-12-16 08:12:30' +%s
3185943150
root@506429f94d6f:/# php -a
Interactive shell

php > $to = new DateTime();
php > $to->setTimestamp(3185943150);
php > $now = new DateTime();
php > $reminder = $to->getTimestamp() - $now->getTimestamp();
php > var_dump(($reminder - $reminder % 86400) / 86400);
int(17044)
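
The same arithmetic can be reproduced outside PHP. The following Python sketch is only an illustration: `whole_days_remaining` is a made-up helper, and the "now" value is a fixed, hypothetical timestamp (2024-04-17 00:00:00 UTC) chosen so that the result is reproducible.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def whole_days_remaining(now_ts: int, valid_to_ts: int) -> int:
    """Whole days between two Unix timestamps, truncated toward zero.

    Mirrors the PHP expression ($r - $r % 86400) / 86400 from the session above.
    """
    remaining = valid_to_ts - now_ts
    sign = -1 if remaining < 0 else 1
    return sign * (abs(remaining) // SECONDS_PER_DAY)


# Same instant as `date -d '2070-12-16 08:12:30' +%s`, assuming UTC.
valid_to = int(datetime(2070, 12, 16, 8, 12, 30, tzinfo=timezone.utc).timestamp())

# Hypothetical "current time", fixed for the sake of the example.
now = int(datetime(2024, 4, 17, tzinfo=timezone.utc).timestamp())

print(valid_to)                             # 3185943150
print(whole_days_remaining(now, valid_to))  # 17044
```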

slalomsk8er commented 2 months ago

Are you guys all using MySQL/MariaDB?

Worse, it's a MariaDB Galera Cluster.

slalomsk8er commented 2 months ago

Maybe related?

'/usr/bin/icingacli' 'x509' 'check' 'host' '--critical' '3d' '--host' 'example02.example.com' '--ip' '10.5.77.46' '--port' '443' '--warning' '7d'
CRITICAL - example02.example.com has expired since 67 days|'example02.example.com'=0s;604800:;259200:;0;1555973

[two screenshots]

yhabteab commented 2 months ago

... which seems to work fine with PostgreSQL in any case, but with MySQL/MariaDB it produces unexpected results if the certificate is not yet part of a chain.

After rechecking the generated query, I have to correct that statement: it also affects PostgreSQL. The subqueries for valid_from and valid_to perform an inner join on the issuer, which obviously does not exist yet, as the invalid reason "unable to get local issuer certificate" states, and PHP then does some magic things with the resulting null values.

In case you need a fix right now, you can use this patch, which should solve the problem.

diff --git a/application/clicommands/CheckCommand.php b/application/clicommands/CheckCommand.php
index 0c369d9..cd6e363 100644
--- a/application/clicommands/CheckCommand.php
+++ b/application/clicommands/CheckCommand.php
@@ -108,8 +108,12 @@ class CheckCommand extends Command
         list($validToSelect, $_) = $validTo->dump();
         $targets
             ->withColumns([
-                'valid_from' => new Expression($validFromSelect),
-                'valid_to'   => new Expression($validToSelect)
+                'valid_from' => new Expression(
+                    sprintf('COALESCE((%s), target_chain_certificate.valid_from)', $validFromSelect)
+                ),
+                'valid_to'   => new Expression(
+                    sprintf('COALESCE((%s), target_chain_certificate.valid_to)', $validToSelect)
+                )
             ])
             ->getSelectBase()
             ->where(new Expression('target_chain_link.order = 0'));
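
To see why the COALESCE fallback in the patch helps, here is a minimal Python/SQLite sketch. It uses a deliberately simplified, hypothetical `cert` table (not the module's real schema or query): an aggregate over an inner join on a missing issuer yields NULL, and COALESCE then falls back to the target certificate's own date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified table; the real module schema is different.
    CREATE TABLE cert (id INTEGER PRIMARY KEY, subject TEXT, issuer_id INTEGER, valid_to INTEGER);
    -- The issuer (id 99) was never imported, so the join below finds no row.
    INSERT INTO cert VALUES (1, 'mirth-connect', 99, 3185943150);
""")

# MIN() over zero joined rows evaluates to NULL.
chain_valid_to = """
    SELECT MIN(issuer.valid_to)
    FROM cert AS leaf
    INNER JOIN cert AS issuer ON issuer.id = leaf.issuer_id
    WHERE leaf.id = target.id
"""

without_fallback = conn.execute(
    f"SELECT ({chain_valid_to}) FROM cert AS target WHERE target.id = 1"
).fetchone()[0]

with_fallback = conn.execute(
    f"SELECT COALESCE(({chain_valid_to}), target.valid_to) "
    "FROM cert AS target WHERE target.id = 1"
).fetchone()[0]

print(without_fallback)  # None
print(with_fallback)     # 3185943150
```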

Maybe related?

'/usr/bin/icingacli' 'x509' 'check' 'host' '--critical' '3d' '--host' 'example02.example.com' '--ip' '10.5.77.46' '--port' '443' '--warning' '7d'
CRITICAL - example02.example.com has expired since 67 days|'example02.example.com'=0s;604800:;259200:;0;1555973

[two screenshots]

That's actually by design and works as it should! Like I said before, the check command determines the greatest valid_from and the least valid_to timestamp of the certificates from the entire chain. In your case, even though the actual certificate hasn't expired, its issuer and all intermediates up to the root CA expired 67 days ago. Thus, the check command classifies it as expired, as a certificate with an expired CA is not valid anyway.
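
The rule described here (greatest valid_from, least valid_to across the chain) can be sketched in a few lines of Python. This is an illustration under assumptions, not the module's code: the `Cert` class, the helper names, the issuer name and all timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

SECONDS_PER_DAY = 86400


@dataclass
class Cert:
    subject: str
    valid_from: int  # Unix timestamp
    valid_to: int    # Unix timestamp


def chain_window(chain: List[Cert]) -> Tuple[int, int]:
    """Return (greatest valid_from, least valid_to) across the whole chain."""
    return max(c.valid_from for c in chain), min(c.valid_to for c in chain)


def days_until_expiry(chain: List[Cert], now: int) -> int:
    """Whole days until the chain's window closes; negative once it has expired."""
    _, valid_to = chain_window(chain)
    remaining = valid_to - now
    sign = -1 if remaining < 0 else 1
    return sign * (abs(remaining) // SECONDS_PER_DAY)


now = 1_715_000_000  # hypothetical "current time"
chain = [
    # Leaf certificate: still valid for 300 days.
    Cert("example02.example.com", now - 100 * SECONDS_PER_DAY, now + 300 * SECONDS_PER_DAY),
    # Hypothetical issuer: expired 67 days ago, so it closes the window.
    Cert("Hypothetical Issuing CA", now - 900 * SECONDS_PER_DAY, now - 67 * SECONDS_PER_DAY),
]

print(days_until_expiry(chain, now))  # -67
```

With only the leaf in the list, the same helper reports 300 days, which is the kind of gap between the leaf's own dates and the chain-wide result discussed in this thread.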

slalomsk8er commented 2 months ago

@yhabteab Thank you, I will try the patch immediately.

I disagree. In the first picture the chain is valid, and the second picture also shows old certificates. To me it looks like the check somehow picks them up, in contrast to the web interface, which shows a valid chain, as does Chrome. So, per the title of this issue, it's also a difference between the CLI and the module interface, but probably with a different cause.

yhabteab commented 2 months ago

I disagree. In the first picture the chain is valid, and the second picture also shows old certificates. To me it looks like the check somehow picks them up, in contrast to the web interface, which shows a valid chain, as does Chrome.

Well, initially I didn't notice that the certificate subject is the same for all the certificates listed in the UI, but it seems like you have some outdated data in your database.

So, per the title of this issue, it's also a difference between the CLI and the module interface, but probably with a different cause.

Can you please try to clean up your database as described here before opening a new issue?

slalomsk8er commented 2 months ago

Not sure if a clean up is a good idea before we understand the problem. My suspicion is that, because of the self signing, it doesn't replace old ones but chains them and the module interface and the CLI handle them differently.

Your patch removed a bunch but I still have at least one similar case.

'/usr/bin/icingacli' 'x509' 'check' 'host' '--critical' '3d' '--host' 'example03.example.com' '--ip' '10.5.69.82' '--port' '8443' '--warning' '7d'
CRITICAL - example03.example.com: self signed certificate; example03.example.com has expired since 19 days|'example03.example.com'=0s;604800:;259200:;0;2741587

[screenshot]

example03.example.com
Subject
CN  example03.example.com
Issuer Name
CN  example03.example.com
Certificate Info
Serial Number    18c4791aaa45647b
Version 3
Signature Algorithm    RSA with SHA256
Not Valid Before    Tuesday October 18th, 2022 16:06:58 Europe/Paris
Not Valid After   Sunday October 17th, 2027 16:06:58 Europe/Paris
Public Key Info
Algorithm RSA
Key Size 2048
Extensions
Subject Alt Name    IP Address:10.5.69.82, DNS:example03.example.com, IP Address:0:0:0:0:0:0:0:1, DNS:example03.example.com, DNS:example03, DNS:example03.local
Fingerprints
SHA-256 EB 1C 21 72 BA 6C 61 5E 8A F3 61 0C EF 6F 54 03 8C BB B6 51 46 36 9D B6 5E FE A6 F3 03 8A FA 48

yhabteab commented 1 month ago

Not sure if a clean up is a good idea before we understand the problem.

If you remove some valid hosts with the cleanup command, they will show up again with the next scan anyway, so I don't see any harm in that. Since I can't reproduce this on my end, I have no idea why the expiry date is wrong, but to be honest it's an indicator of stale data in the database.

$ openssl x509 -in self-signed-one.crt -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            3d:8e:47:b6:e1:a3:8a:8b:df:63:c8:23:d8:88:ed:1b:26:a4:39:57
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=self-signed-one
        Validity
            Not Before: May  8 16:00:07 2024 GMT
            Not After : Jun  1 16:00:07 2024 GMT
        Subject: CN=self-signed-one
–––––––

$ icingacli x509 check --host 'self-signed-one'
CRITICAL - self-signed-one: self-signed certificate; self-signed-one expires in 23 days|'self-signed-one'=2072658s;518400:;172800:;0;2073600

[Screenshot 2024-05-08 at 18.15.29]

yhabteab commented 1 month ago

Actually, I was able to reproduce this after some time 🙈.

[Screenshot 2024-05-13 at 12.08.32]

$ icingacli x509 check --host 'self-signed-one'
CRITICAL - self-signed-one: self-signed certificate; self-signed-one expires in 59 days|'self-signed-one'=5179266s;1296000:;518400:;0;5181933

MariaDB [x509]> SELECT COUNT(*) FROM x509_certificate WHERE subject = 'self-signed-one';
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.000 sec)

With https://github.com/Icinga/icingaweb2-module-x509/pull/240:

$ icingacli x509 check --host 'self-signed-one'
CRITICAL - self-signed-one: self-signed certificate; self-signed-one expires in 99 days|'self-signed-one'=8635468s;2160000:;864000:;0;8635598

Edit:

Though, as I said before, the cleanup command would have fixed that too, since it removes unused certificates with the same subject name that would otherwise cause conflicts.
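
A quick way to spot subjects that occur more than once is a GROUP BY ... HAVING query, in the spirit of the COUNT(*) check above. The sketch below runs it against an in-memory SQLite table that only borrows the `x509_certificate` name and `subject` column; the remaining columns and all rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced stand-in for the module's table.
conn.execute(
    "CREATE TABLE x509_certificate (id INTEGER PRIMARY KEY, subject TEXT, valid_to INTEGER)"
)
conn.executemany(
    "INSERT INTO x509_certificate (subject, valid_to) VALUES (?, ?)",
    [
        ("self-signed-one", 1717257607),  # older certificate (made-up timestamp)
        ("self-signed-one", 1723000000),  # newer certificate with the same subject
        ("other-host", 1800000000),
    ],
)

# Subjects stored more than once are candidates for the conflict described above.
duplicates = conn.execute(
    "SELECT subject, COUNT(*) FROM x509_certificate "
    "GROUP BY subject HAVING COUNT(*) > 1 ORDER BY subject"
).fetchall()

print(duplicates)  # [('self-signed-one', 2)]
```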

slalomsk8er commented 1 month ago

@yhabteab I still get the diff between cli and web gui for:

[screenshot]

CRITICAL - example02.example.com has expired since 67 days|'example02.example.com'=0s;604800:;259200:;0;1555973

Also, '/usr/bin/icingacli' x509 cleanup --since-last-scan="1 months" and '/usr/bin/icingacli' 'x509' verify didn't help.

yhabteab commented 1 month ago

@yhabteab I still get the diff between cli and web gui for:

CRITICAL - example02.example.com has expired since 67 days|'example02.example.com'=0s;604800:;259200:;0;1555973

Nope, that's the expected result! Your example02.example.com certificate (not CA) may not have expired yet, but its issuer has already expired. See https://github.com/Icinga/icingaweb2-module-x509/issues/64 for details.

slalomsk8er commented 1 month ago

@yhabteab The problem is that those self-signed certs aren't the issuer certificates and still show up after cleanup, and the difference between the web GUI and the CLI is annoying and hampers customer trust in our setup.

[screenshot]

Maybe I should try to schedule a call with you via the Linuxfabrik?

yhabteab commented 1 month ago

Can you please provide me with the output of the following query? If that does not help me identify the cause, we can then coordinate a call. (Please start MariaDB with the --binary-as-hex option.)

SELECT *
FROM x509_certificate
    INNER JOIN x509.x509_certificate_chain_link AS link ON x509_certificate.id = link.certificate_id
    INNER JOIN x509.x509_certificate_chain AS chain ON link.certificate_chain_id = chain.id
    INNER JOIN x509.x509_target AS x509t ON chain.target_id = x509t.id
WHERE chain.id = CHAIN_ID_OF_YOUR_CORRUPT_CERTIFICATE;

slalomsk8er commented 1 month ago

@yhabteab Done. Where should I send it privately?

yhabteab commented 1 month ago

@yhabteab Done. Where should I send it privately?

Please upload it here.

yhabteab commented 1 month ago

@yhabteab Done. Where should I send it privately?

The certificate chain you provided is totally messed up, and I don't even know why it is considered a valid chain, as the actual issuer of that certificate is not part of it. Also, can you upload the output of openssl s_client -showcerts -connect 10.5.77.46:443 via the above link? To me it looks like the web server is providing way too many certificates for the same host/server. Are you also using the latest version of the x509 module?
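
One simple way to check how many certificates a server hands out is to count the PEM blocks in the saved `-showcerts` output. The Python sketch below assumes the output has already been captured as text; the `pem_blocks` helper and the sample text are made up for illustration.

```python
import re
from typing import List

# Matches one PEM-encoded certificate, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def pem_blocks(text: str) -> List[str]:
    """Return every PEM certificate block found in the given text."""
    return PEM_BLOCK.findall(text)


# Made-up stand-in for captured `openssl s_client -showcerts` output.
sample = """
 0 s:CN = example-leaf
-----BEGIN CERTIFICATE-----
AAAA
-----END CERTIFICATE-----
 1 s:CN = example-extra
-----BEGIN CERTIFICATE-----
BBBB
-----END CERTIFICATE-----
"""

print(len(pem_blocks(sample)))  # 2
```

A count higher than the expected leaf plus intermediates hints that the server configuration is bundling extra certificates.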

I have also tried to recreate your situation by providing a number of meaningless certificates in my nginx cert file, and was able to kind of reproduce it.

[Screenshot 2024-05-14 at 16.17.20]

The first certificate expires in about 400 days and the second one in 300 days and the check command even with #240 outputs OK - self-signed-one expires in 299 days|'self-signed-one'=25917068s;6480000:;2592000:;0;25917144

But I suppose the check command should only consider certificates with certificate_chain_link.order = 0 and completely ignore the remaining ones. With the updated version of #240, it at least delivers the expected result, I guess.

$ icingacli x509 check --host 'self-signed-one'
OK - self-signed-one expires in 399 days|'self-signed-one'=34558908s;8640000:;3456000:;0;34560000
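
The difference between the two results (299 vs. 399 days) can be illustrated with a small Python sketch. It assumes, as in the reproduction above, that the server presents two unrelated certificates and that only the one with order 0 belongs to the host; the `ChainLink` type, the helper names and the second subject are hypothetical.

```python
from typing import List, NamedTuple


class ChainLink(NamedTuple):
    order: int      # position in the presented chain, 0 = the host's own certificate
    subject: str
    days_left: int  # days until this certificate expires


def least_days_all(links: List[ChainLink]) -> int:
    """Least remaining lifetime across every presented certificate."""
    return min(link.days_left for link in links)


def least_days_order_zero(links: List[ChainLink]) -> int:
    """Least remaining lifetime, considering only links with order 0."""
    return min(link.days_left for link in links if link.order == 0)


links = [
    ChainLink(0, "self-signed-one", 400),
    ChainLink(1, "unrelated-extra-cert", 300),  # hypothetical extra certificate
]

print(least_days_all(links))         # 300
print(least_days_order_zero(links))  # 400
```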

slalomsk8er commented 1 month ago

I know, the chain is a complete mess. I'm on the current version, including some of the patches from https://github.com/Icinga/icingaweb2-module-x509/pull/240. The self-signed certs were created first, and the cleanup should remove them, as they shouldn't have been seen for quite a while. Maybe they aren't removed because they are in the chain?

yhabteab commented 1 month ago

The self-signed certs were created first, and the cleanup should remove them, as they shouldn't have been seen for quite a while.

That's the confusing part, the orphaned certificates are still part of the new chain with the new certificate, which makes me think the web server is doing some fancy stuff and that's the reason I'm asking for this:

Also, can you upload the output of openssl s_client -showcerts -connect 10.5.77.46:443 in the above link. For me it looks like the web server is providing way too many certificates for the same host/server.

Maybe because they are in the chain they aren't removed?

Usually, when a host receives a new certificate, a new certificate chain is also created, and when you subsequently run the cleanup command, the old chain of that host and its relationships get removed, as it is no longer used by a target host. But your chain is still used by a host target.

slalomsk8er commented 1 month ago

It looks like the useless line SSLCertificateChainFile /etc/.../fullchain.pem in the Apache config was to blame. I also sent fullchain.pem to you for testing. Now I only see the valid certificate, and the web GUI and the check are in sync. One more question: why aren't the intermediate and the root CA, both of which are also in the DB, shown in the chain details?