open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
3.12k stars 2.39k forks source link

[tlscheckreceiver] Implement scraper.go for TLS Check Receiver #35842

Open michael-burt opened 1 month ago

michael-burt commented 1 month ago

Component(s)

receiver/tlscheck

Describe the issue you're reporting

Implement scraper.go for TLS Check Receiver. This should encompass host-based checks for a first pass, we can add checks for certificates stored on disk or in Kubernetes secrets at a later date, as suggested by this comment.

A commenter brought up the issue of how to handle non-https Urls in the targets configuration here. There was also a discussion on whether to use host or url to specify targets here.

Imo we should revert to host from url as currently documented, and disallow the use of a scheme when specifying targets. The reason is that we only need to establish a TCP connection in order to inspect the certificate, we do not need to fire an http request. Since we specify a host when establishing a TCP connection, we should change the name from url to host. Since we are not making an http request, we should disallow the use of a scheme in the target.

michael-burt commented 1 month ago

@breedx-splk how do you feel about reverting to host in the target specification since we only need TCP for this check?

michael-burt commented 1 month ago

@dehaansa does disallowing scheme make sense to you since we are only using TCP to inspect the certificate?

dehaansa commented 1 month ago

@dehaansa does disallowing scheme make sense to you since we are only using TCP to inspect the certificate?

I would personally lean towards accepting scheme, and provide a meaningful error if a non-https scheme is provided. This would allow more users to self-correct when using a non-HTTPS URL on configuration, rather than needing to read logs/see missing metrics at runtime if a non-HTTPS URL was provided with the scheme removed.

michael-burt commented 1 month ago

@dehaansa does disallowing scheme make sense to you since we are only using TCP to inspect the certificate?

I would personally lean towards accepting scheme, and provide a meaningful error if a non-https scheme is provided. This would allow more users to self-correct when using a non-HTTPS URL on configuration, rather than needing to read logs/see missing metrics at runtime if a non-HTTPS URL was provided with the scheme removed.

The issue is that we are not using http at all though, we are using TCP which does not accept any scheme. If we accept scheme, it would make no difference if a user put in foobar://example.com:443 or https://example.com:443 as we would strip the scheme when determining which host and port to dial from the TCP client.

dehaansa commented 1 month ago

@dehaansa does disallowing scheme make sense to you since we are only using TCP to inspect the certificate?

I would personally lean towards accepting scheme, and provide a meaningful error if a non-https scheme is provided. This would allow more users to self-correct when using a non-HTTPS URL on configuration, rather than needing to read logs/see missing metrics at runtime if a non-HTTPS URL was provided with the scheme removed.

The issue is that we are not using http at all though, we are using TCP which does not accept any scheme. If we accept scheme, it would make no difference if a user put in foobar://example.com:443 or https://example.com:443 as we would strip the scheme when determining which host and port to dial from the TCP client.

My intention wouldn't be that scheme would be used in the underlying implementation, but that it would be used in the config validate to confirm that the user at least believed they were providing an HTTPS endpoint.

michael-burt commented 1 month ago

I see, thanks @dehaansa. Are you concerned that users will add a <host>:<port> pointing to a server that does not have TLS enabled? This would result in a TLS error during TCP handshake. We should properly handle this error and communicate it to the user, but I don't think validation of scheme in the config is the best way to do this.

Technically, the scheme should be tcp:// because this is the underlying protocol used to connect to the host. If there were validation on the config which required the use of scheme, I would expect that targets must use tcp://.

What validation would you prefer to see in the config? If a user adds host: http://example.com:80 and the host provides TLS certificates during a TCP handshake on port 80, how would you expect the component to behave?

dehaansa commented 1 month ago

My expectation would be that an http scheme would cause an error during startup/config validation, stating that any URLs provided to the receiver must use an https scheme.

These are just my opinions, I don't have any OTEL standards to point to here so feel free to implement differently this would just be my personal preference. In my experience I find that clear and meaningful configuration time errors are much more effective at increasing users' ability to get up and running without requiring support even if they technically limit the problem space by not supporting some edge cases.

michael-burt commented 1 month ago

Thanks @dehaansa , I appreciate the feedback. I think the main issue I have is this comment:

user at least believed they were providing an HTTPS endpoint.

The user doesn't need to provide an HTTPS endpoint, they could provide an endpoint that uses a different protocol such as GRPC. The TCP handshake is when the certs are exchanged. The could even provide an HTTP endpoint, as long as the server provides TLS certificate during TCP handshake then we are able to verify that server certs are not expired.

github-actions[bot] commented 1 month ago

Pinging code owners for receiver/tlscheck: @atoulme @michael-burt. See Adding Labels via Comments if you do not have permissions to add labels yourself.