influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

(inputs.http_response) Checks failing due to certificate extended key usage attributes being ignored #15567

Open SH30G0RATH opened 2 days ago

SH30G0RATH commented 2 days ago

Relevant telegraf.conf

root@[redacted]-tgf01:/# cat /etc/telegraf/telegraf.d/[redacted]services.conf
   […]
   # Pre-Production URLs with no authentication
   [[inputs.http_response]]
     urls = [
       "https://s3.preprod.[redacted]/minio/health/live"
     ]

     tls_ca = "/etc/telegraf/telegraf.d/[redacted]-ca-preprod.pem"
     follow_redirects = true

Logs from Telegraf

[root@[]redacted _data]# telegraf -test -config test.conf
2024-06-26T15:16:44Z I! Loading config: test.conf
2024-06-26T15:16:44Z I! Starting Telegraf 1.30.1 brought to you by InfluxData the makers of InfluxDB
2024-06-26T15:16:44Z I! Available plugins: 233 inputs, 9 aggregators, 31 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-06-26T15:16:44Z I! Loaded inputs: http_response
2024-06-26T15:16:44Z I! Loaded aggregators:
2024-06-26T15:16:44Z I! Loaded processors:
2024-06-26T15:16:44Z I! Loaded secretstores:
2024-06-26T15:16:44Z W! Outputs are not used in testing mode!
2024-06-26T15:16:44Z I! Tags enabled: host=[Redacted].[Redacted].local
2024-06-26T15:16:44Z D! [agent] Initializing plugins
2024-06-26T15:16:44Z D! [agent] Starting service inputs
2024-06-26T15:16:44Z D! [inputs.http_response] Network error while polling https://s3.preprod.[Redacted]/minio/health/live: Get "https://s3.preprod.[Redacted]/minio/health/live": tls: failed to verify certificate: x509: unhandled critical extension
2024-06-26T15:16:44Z D! [agent] Stopping service inputs
2024-06-26T15:16:44Z D! [agent] Input channel closed
2024-06-26T15:16:44Z D! [agent] Stopped Successfully
> http_response,host=[Redacted].[Redacted].local,method=GET,result=connection_failed,server=https://s3.preprod.[Redacted]/minio/health/live result_code=3i,result_type="connection_failed" 1719415004000000000

Using curl to replicate the test:

   root@[redacted]-tgf01:/# curl -i --cacert /etc/telegraf/telegraf.d/[redacted]-ca-preprod.pem https://s3.preprod.[redacted]/minio/health/live
   HTTP/1.1 200 OK

System info

Telegraf 1.30.1 - Rocky 9.3 - Docker version 26.0.1, build d260a54

Docker

No response

Steps to reproduce

  1. Configure inputs.http_response to check a website using a specific certificate that contains extended key usage attributes.

Expected behavior

The checks via inputs.http_response should return the same response as curl: "200 OK".

Actual behavior

The check reports result_code=3i,result_type="connection_failed", even though there is no problem with the site.

Telegraf documentation suggests that result_code=3 indicates a network failure outside of Telegraf, but running curl from the same server shows otherwise.

Additional info

We believe the problem is that the Go library Telegraf uses to perform the request doesn't support an endpoint that uses a certificate with an extended key usage attribute.

powersj commented 2 days ago

Hi,

> We believe the problem is that the Go library Telegraf uses to perform the request doesn't support an endpoint that uses a certificate with an extended key usage attribute.

We are using Go's standard library: net/http for the request and crypto/tls to parse and set up TLS. You need to identify which field in your certificate is causing the issue; that way you can also file an upstream issue to get this resolved.
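One possible way to find the offending field is to let crypto/x509 parse the certificates and print what it could not handle. The following is a minimal sketch, not part of Telegraf: it expects a PEM file containing the certificates to inspect (the server's leaf and intermediates as well as the CA, since any certificate in the chain can trigger the error). When run without arguments it falls back to a throwaway self-signed certificate with a made-up critical extension (OID 1.2.3.4, purely hypothetical) so the output format can be seen. The names demoOID, demoCertPEM and parseCerts are illustrative only.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"encoding/pem"
	"fmt"
	"math/big"
	"os"
	"time"
)

// demoOID is a made-up extension OID used only by the built-in demo.
var demoOID = asn1.ObjectIdentifier{1, 2, 3, 4}

// demoCertPEM returns a throwaway self-signed certificate carrying one
// critical extension that crypto/x509 does not recognise.
func demoCertPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo.invalid"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		ExtraExtensions: []pkix.Extension{
			{Id: demoOID, Critical: true, Value: []byte{0x05, 0x00}}, // ASN.1 NULL
		},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

// parseCerts parses every CERTIFICATE block found in pemBytes.
func parseCerts(pemBytes []byte) ([]*x509.Certificate, error) {
	var certs []*x509.Certificate
	for {
		var block *pem.Block
		block, pemBytes = pem.Decode(pemBytes)
		if block == nil {
			return certs, nil
		}
		if block.Type != "CERTIFICATE" {
			continue
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, err
		}
		certs = append(certs, cert)
	}
}

func main() {
	var pemBytes []byte
	var err error
	if len(os.Args) > 1 {
		pemBytes, err = os.ReadFile(os.Args[1]) // PEM file supplied by the user
	} else {
		pemBytes, err = demoCertPEM() // built-in demo certificate
	}
	if err == nil && len(pemBytes) == 0 {
		err = fmt.Errorf("no PEM data")
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	certs, err := parseCerts(pemBytes)
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse error:", err)
		os.Exit(1)
	}
	for _, c := range certs {
		fmt.Printf("subject: %s\n", c.Subject)
		for _, ext := range c.Extensions {
			fmt.Printf("  extension %s critical=%t\n", ext.Id, ext.Critical)
		}
		// Any OID listed here makes chain verification fail with
		// "x509: unhandled critical extension".
		fmt.Printf("  unhandled critical: %v\n", c.UnhandledCriticalExtensions)
		fmt.Printf("  unknown extended key usages: %v\n", c.UnknownExtKeyUsage)
	}
}
```

Any OID printed under "unhandled critical" is a candidate to name in an upstream report. An extended key usage that Go does not know is listed separately as an unknown extended key usage, which helps tell the two cases apart.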

Looking upstream I do see some issues referencing extended key usage:

Is it possible you are setting a directory name like the first issue?

As a workaround, if you set insecure_skip_verify = true, does that bypass this check?

In any case, this really needs to be solved upstream and not here.