grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Flow mode component `otelcol.receiver.vcenter` does not use most configured TLS settings #193

Open geoffmore opened 3 months ago

geoffmore commented 3 months ago

What's wrong?

I get an error for an untrusted CA even when setting tls.insecure_skip_verify to true.

I wrote a test in vcenter_test.go (TestArguments_UnmarshalRiver) to make sure the River config for the TLS block was being unmarshalled correctly. The test adds

tls {
    insecure_skip_verify = true
}

right under the collection_interval line.

The test is

require.Equal(t, true, otelArgs.TLSClientSetting.InsecureSkipVerify)

It returned the expected true, and the test passed.

In my testing, I noticed that the insecure option is respected.

Looking at the code, it appears that only the insecure option is respected because of the underlying client. Given how few function arguments that client accepts relative to the tls block of otelcol.receiver.vcenter, I believe another client should be used.
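To illustrate what "respecting the whole tls block" could look like, here is a minimal Go sketch that maps the block's options onto a standard-library tls.Config and attaches it to an HTTP transport. It is not the upstream receiver's code: the tlsSettings struct, buildTLSConfig, and newHTTPClient are hypothetical names invented for this example, and key_file is assumed to accompany cert_file.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"os"
)

// tlsSettings is a hypothetical stand-in for the options in the tls block.
type tlsSettings struct {
	CAFile             string // ca_file
	CertFile           string // cert_file
	KeyFile            string // key_file (assumed to accompany cert_file)
	ServerName         string // server_name
	InsecureSkipVerify bool   // insecure_skip_verify
}

// buildTLSConfig translates the settings into a standard-library tls.Config.
func buildTLSConfig(s tlsSettings) (*tls.Config, error) {
	cfg := &tls.Config{
		ServerName:         s.ServerName,
		InsecureSkipVerify: s.InsecureSkipVerify,
	}
	if s.CAFile != "" {
		pem, err := os.ReadFile(s.CAFile)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, fmt.Errorf("no certificates found in %s", s.CAFile)
		}
		cfg.RootCAs = pool
	}
	if s.CertFile != "" && s.KeyFile != "" {
		cert, err := tls.LoadX509KeyPair(s.CertFile, s.KeyFile)
		if err != nil {
			return nil, fmt.Errorf("load client certificate: %w", err)
		}
		cfg.Certificates = []tls.Certificate{cert}
	}
	return cfg, nil
}

// newHTTPClient returns an HTTP client whose transport honors every setting,
// rather than a single boolean.
func newHTTPClient(s tlsSettings) (*http.Client, error) {
	cfg, err := buildTLSConfig(s)
	if err != nil {
		return nil, err
	}
	return &http.Client{Transport: &http.Transport{TLSClientConfig: cfg}}, nil
}
```

These are helper functions only; call newHTTPClient from your own main or test. Whether the vSphere client used by the receiver can accept a custom transport or tls.Config like this is something that would need to be checked upstream.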

Steps to reproduce

  1. Add an argument other than insecure to the tls block of component otelcol.receiver.vcenter
  2. See error logs if the CA/Cert is not trusted

System information

Fedora 6.7.11-200.fc39.x86_64

Software version

Grafana Agent v0.40.3

Configuration

otelcol.receiver.vcenter "default" {
  endpoint = "https://URL"
  username = "Username"
  password = "Password"

  tls {
    insecure_skip_verify = true
    insecure = false // testing true
    ca_file = "/etc/agent/ca.pem"
    cert_file = "/etc/agent/cert.pem"
    server_name = "URL"
  }
  debug_metrics {}

  output {
    metrics = [otelcol.processor.batch.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.auth.basic "grafana_cloud_tempo" {
    username = "temp-user"
    password = "tempo-password"
}

otelcol.exporter.otlp "default" {
  client {
    endpoint = "tempo-endpoint"
    auth = otelcol.auth.basic.grafana_cloud_tempo.handler
  }
  debug_metrics {}
}

Logs

unable to establish a connection to the vSphere SDK unable to connect to vSphere SDK on listed endpoint: Post \"https://<vsphere_url>/sdk\": tls: failed to verify certificate: x509: certificate signed by unknown authority

hainenber commented 3 months ago

I think the config's logic makes sense here: if insecure is set to false, it forces verification and supersedes the other TLS options. Maybe the docs could be clearer about this.

geoffmore commented 3 months ago

Based on the tls block, I would expect the insecure option to disable TLS entirely, which is very different from accepting a cert with an untrusted CA. Having the options present in different places but unused seems strange to me.
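The distinction can be sketched with Go's standard library alone. The example below is not taken from the receiver; it starts a throwaway HTTPS server with a self-signed certificate (via net/http/httptest) and probes it with three client configurations. The names probe, defaultConfig, skipVerifyConfig, and trustServerConfig are made up for illustration.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"net/http"
	"net/http/httptest"
)

// configFor builds a client TLS config given the test server's certificate.
type configFor func(serverCert *x509.Certificate) *tls.Config

// defaultConfig verifies the server against the system trust store.
func defaultConfig(*x509.Certificate) *tls.Config { return &tls.Config{} }

// skipVerifyConfig keeps TLS enabled but accepts any certificate,
// which is what insecure_skip_verify = true is expected to mean.
func skipVerifyConfig(*x509.Certificate) *tls.Config {
	return &tls.Config{InsecureSkipVerify: true}
}

// trustServerConfig verifies against a pool holding the server's own
// certificate, similar in spirit to pointing ca_file at a private CA.
func trustServerConfig(serverCert *x509.Certificate) *tls.Config {
	pool := x509.NewCertPool()
	pool.AddCert(serverCert)
	return &tls.Config{RootCAs: pool}
}

// probe starts a self-signed HTTPS server and reports whether a client
// using the given configuration can complete a request against it.
func probe(build configFor) error {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	transport := &http.Transport{TLSClientConfig: build(srv.Certificate())}
	defer transport.CloseIdleConnections()

	resp, err := (&http.Client{Transport: transport}).Get(srv.URL)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}
```

Calling probe(defaultConfig) should fail with a certificate verification error much like the one in the logs above, while probe(skipVerifyConfig) and probe(trustServerConfig) should both succeed over TLS. None of the three disables TLS, which is the separate behavior I would expect from insecure.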

Either way, this feels like an issue that can/should be resolved upstream in https://github.com/open-telemetry/opentelemetry-collector-contrib/, since that repo has the code that Grafana Agent uses here.

I have ideas for a way to enable all config options, but it will probably be easier to just create a PR to describe what I'm aiming for.

What are your thoughts on being able to leverage all of the options of the tls block?

hainenber commented 3 months ago

I do agree on resolving this upstream, tbh. We're bound by the exposed interface here.

> What are your thoughts on being able to leverage all of the options of the tls block?

I'm all for it :D

rfratto commented 3 months ago

Hi there :wave:

On April 9, 2024, Grafana Labs announced Grafana Alloy, the spiritual successor to Grafana Agent and the final form of Grafana Agent flow mode. As a result, Grafana Agent has been deprecated and will only be receiving bug and security fixes until its end-of-life around November 1, 2025.

To make things easier for maintainers, we're in the process of migrating all issues tagged variant/flow to the Grafana Alloy repository to have a single home for tracking issues. This issue is likely something we'll want to address in both Grafana Alloy and Grafana Agent, so just because it's being moved doesn't mean we won't address the issue in Grafana Agent :)

github-actions[bot] commented 2 months ago

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it. If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue. The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity. Thank you for your contributions!