Icinga / icingaweb2-module-x509

Keeps track of certificates as they are deployed in a network environment.
https://icinga.com/docs/x509/latest/
GNU General Public License v2.0

Jobs silently fail to retrieve certificates #210

Open jabbott-hbk opened 1 year ago

jabbott-hbk commented 1 year ago

Describe the bug

I have a job set up to scan a subnet and collect certificates.

There is a host on that subnet with a certificate that contains characters that cannot be written to the DB, so that host's certificate is not captured (if interested, see issue for additional details on the problem with this cert).

When I attempt to upload the problematic cert manually, or even when I create a job to scan just that one host, I see a failure message that the certificate cannot be written to the database. This is what I would expect.

However, if I create a job to scan a subnet of which that host is a part, that job will NOT fail. I would expect the job to fail when it finds the unwritable certificate. Instead, it continues scanning without ever raising an error. The host's certificate is not written to the DB, but I have no indication that the failure has occurred.

To Reproduce

  1. Give a malformed certificate to some arbitrary host
  2. Try uploading that certificate using "icingacli x509 import". You will see an error
  3. Create a job in the x509 module to scan only the one host/port with the malformed certificate. Run the job. You will see an error here too.
  4. Create a job in the x509 module to scan a subnet that includes the host with the malformed certificate. Run the job. You will not see an error here, even though this job will also fail to capture the certificate.

Expected behavior

I would expect the job to continue scanning other hosts in the subnet and writing their certificates to the DB, but I would expect some stderr output to indicate that an error occurred.
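The behavior requested here (keep scanning, but surface every failure) can be sketched as follows. This is only an illustration in Python, not the module's actual PHP implementation; `scan_targets`, `fetch_and_store`, and `exit_code` are hypothetical names introduced for the example.

```python
import sys
from typing import Callable, Iterable, List, Tuple


def scan_targets(
    targets: Iterable[str],
    fetch_and_store: Callable[[str], None],
) -> List[Tuple[str, Exception]]:
    """Scan every target, recording failures instead of aborting or hiding them."""
    failures: List[Tuple[str, Exception]] = []
    for target in targets:
        try:
            # Hypothetical callback: fetch the certificate and write it to the DB.
            fetch_and_store(target)
        except Exception as exc:
            # One bad certificate must not stop the scan, but it must be reported.
            failures.append((target, exc))
            print(f"scan failed for {target}: {exc}", file=sys.stderr)
    return failures


def exit_code(failures: List[Tuple[str, Exception]]) -> int:
    """Return 0 when every target succeeded, 1 if any target failed."""
    return 1 if failures else 0
```

With this shape, a job over a whole subnet and a job over a single host report the same error for the same certificate; the only difference is how many other targets succeed around it.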

Your Environment

Icinga Web 2 version: 2.11.4
ipl version: 0.11.1
thirdparty version: 0.11.0
x509 version: 1.2.1
x509 db backend: Postgres 12.9
PHP version: 7.3.11
Web browser used: N/A. Commands executed through ssh connection to Icinga master
Icinga 2 version used (icinga2 --version): r2.13.6-1
Server operating system and version: CentOS 7

yhabteab commented 1 year ago

Hi @JA-HBK, thanks for reporting.

> When I attempt to upload the problematic cert manually, or even when I create a job to scan just that one host, I see a failure message that the certificate cannot be written to the database. This is what I would expect.

Can you please share the error as well?

> However, if I create a job to scan a subnet of which that host is a part, that job will NOT fail. I would expect for the job to fail when it found the un-writeable certificate. Instead, it continues scanning without ever raising an error. The host's certificate is not written to the DB, but I have no indication that the failure has occurred.

How do you run the job? With the scan command or the jobs command?

jabbott-hbk commented 1 year ago

[Screenshot: scan-command-error]

Uploading the manual cert is done with the 'import' command, as shown here.

The job to scan the subnet, which seems to have the silent failure, is done with the 'scan' command. I define the job under the 'jobs' tab for the x509 module in the Icinga web UI, and then use the 'scan' command on the Icinga master.

yhabteab commented 1 year ago

> ipl version: 0.10.1

Is this the Icinga PHP Library version you are using? x509 version 1.2.1 requires >=0.11.1, so I wonder how you could even enable this module. Please upgrade to the latest release and check if the problem still occurs. I cannot reproduce this on my end, but it seems that the Binary behavior in our IPL ORM does not transform binary data into hex for the PostgreSQL adapter.
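To illustrate why a missing hex transformation could matter, here is a minimal Python sketch (not the IPL ORM code). It shows that raw DER-like bytes generally do not decode as UTF-8 text, whereas PostgreSQL's bytea hex input format is plain ASCII. The sample bytes are made up, `to_bytea_hex` is a hypothetical helper, and the thread does not confirm that this is the actual cause.

```python
def to_bytea_hex(data: bytes) -> str:
    """Encode raw bytes in PostgreSQL's bytea hex input format:
    a backslash, the letter x, then two hex digits per byte."""
    return "\\x" + data.hex()


# Made-up bytes resembling the start of a DER structure (0x30 0x82 ...).
sample = bytes([0x30, 0x82, 0x03, 0xE8, 0xA0, 0x03])

try:
    sample.decode("utf-8")
    decodes_as_text = True
except UnicodeDecodeError:
    # 0x82 is a UTF-8 continuation byte with no lead byte, so decoding fails.
    decodes_as_text = False

hex_literal = to_bytea_hex(sample)
```

Sent as text, such bytes can trigger an encoding error in a UTF-8 database; sent in the hex form, they contain only ASCII characters.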

jabbott-hbk commented 1 year ago

Hello @yhabteab, my mistake. I copy/pasted the environment information from a previous issue that I posted. I thought those versions were all up to date, but they were not. I've updated the versions appropriately in my original post. My IPL is version 0.11.1, as you recommended.

yhabteab commented 1 year ago

Can you then please share the SANs of the corrupted certificate? I would like to know which character is causing this encoding problem. You can find this out, for example, with the following command:

openssl x509 -in yonas.crt -noout -text | grep -A 3 'Subject Alternative Name'

Note: You have to adjust the certificate name yonas.crt.

The output should look something like this:

X509v3 Subject Alternative Name: 
    email:yonas@icinga.com, URI:überögabeä@

yhabteab commented 1 year ago

Please also share a screenshot of the System -> About page.

jabbott-hbk commented 1 year ago

[Screenshot: icinga-about]

Hello @yhabteab, here is the About page as requested. My internal secops team requested that I not provide specific hostname SANs for that cert, but I can confirm that there are no special characters in the SAN. It is all alphanumeric values and periods. If you need more information than that, let me know and I'll see what I can do.

yhabteab commented 1 year ago

> If you need more information than that, let me know and I'll see what I can do

I'm also running out of ideas 😔. How did you create the database? And what is the output of this command: psql -Upostgres -l?

And why is it failing only due to this particular certificate, while the others are working correctly (assuming you're able to import/scan other certificates without issues)?

jabbott-hbk commented 1 year ago

Hey @yhabteab Sorry for the delay in my response.

I created the DB following the instructions in the module here

Listing DBs with that psql command, I can see that it's utf8 encoded with en_US.UTF-8 as both the Collate and Ctype, which seems correct based on the docs.

I am able to scan and import other certificates without a problem. It insists that this one has some kind of non-utf8 character though, which I can't find anywhere in the cert.
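One way to hunt for such a byte, sketched in Python under the assumption that the raw bytes of the certificate field in question are available: report the offset of the first byte that is not valid UTF-8. `first_invalid_utf8_offset` is a hypothetical helper, not part of the module or of openssl.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte of the invalid sequence.
        return exc.start
    return None
```

For example, `first_invalid_utf8_offset(b"abc\xffdef")` returns `3`, while a purely alphanumeric value such as `b"example.com"` returns `None`. Text output that looks clean can still hide such a byte if the tool printing it escapes or drops it.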

However, I think the problem with this specific certificate is better captured in the report I opened: https://github.com/Icinga/icingaweb2-module-x509/issues/160. I think we've started to muddle this thread with that one. (I'm guilty of this as well.)

The main issue I wanted to raise in this thread is that the job scanning this certificate fails silently when there are other hosts to scan in the job's CIDR range, but fails with an error message when it scans only this one certificate. I think it should raise an error whether or not the job is scanning multiple targets.