Closed sairamsadanala closed 2 months ago
Pinging code owners:
extension/healthcheck: @jpkrohling
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hello @sairamsadanala, thanks for filing this issue. As the message states, if you prefer to keep the default endpoint as 0.0.0.0, you can disable the component.UseLocalHostAsDefaultHost feature gate. Information on how to disable feature gates can be found here: https://github.com/open-telemetry/opentelemetry-collector/blob/main/featuregate/README.md
For more information on the reasoning and context behind changing the default to localhost instead of 0.0.0.0, please refer to this issue: https://github.com/open-telemetry/opentelemetry-collector/issues/8510
The best option is to update your configuration to work with an endpoint other than 0.0.0.0, as pointed out in the linked issue, because binding to 0.0.0.0 is a potential security risk.
Thanks Robert,
I am running otelcol-contrib on AWS ECS, building the image with Docker and pushing it to ECR. Can you give me a sample showing how to disable the localhost default in the Docker build?
ENTRYPOINT [ "/otelcol-contrib", "--config=/etc/otel/config.yaml", "--feature-gates=-<WHAT GATE NAME SHOULD I USE HERE>" ]
The feature gate name is component.UseLocalHostAsDefaultHost 👍
Is this what you are referring to?
ENTRYPOINT [ "/otelcol-contrib", "--config=/etc/otel/config.yaml", "--feature-gates=-component.UseLocalHostAsDefaultHost" ]
Right, I believe that should work.
The real solution though is to set your health check extension to use 0.0.0.0 (or NodeIP) instead:
health_check:
  endpoint: 0.0.0.0:13133
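For context, here is a minimal sketch of where that fragment sits in a collector configuration. It is only a fragment: receivers, exporters, and pipelines are omitted. The 0.0.0.0 bind address and port 13133 come from the suggestion above. The one extra fact shown is that an extension is only started when it is also listed under service.extensions.

```yaml
extensions:
  health_check:
    # Listen on all interfaces so an external health checker
    # (for example a load balancer) can reach the endpoint.
    endpoint: 0.0.0.0:13133

service:
  # An extension defined above is only started if it is listed here.
  extensions: [health_check]
```

Binding to all interfaces exposes the health endpoint on every network interface of the container, so a specific node or pod IP is preferable where the environment provides one.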
thanks @jpkrohling -- that was the answer that was eluding me
Except: the real solution is that that be the default, as I was using the default which was leading to this error
the real solution is that that be the default
We consciously moved from the default "0.0.0.0" to "localhost".
Having 'localhost' as a default is sensible security-wise.
What I did not expect was that, even if I explicitly specify '0.0.0.0', it's still changed to localhost. I would expect this to only happen if I did not specify anything (hence 'default').
Example config:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
Wouldn't it make more sense to only change the endpoint to localhost in case of:
receivers:
  otlp:
    protocols:
      grpc:
      http:
I agree with you, and I just tested on v0.108.0 and it works as expected:
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
Logs:
2024-09-06T15:06:21.023+0200 info otlpreceiver@v0.108.1/otlp.go:153 Starting HTTP server {"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "0.0.0.0:4318"}
Well, that's surprising, my previous test last week didn't seem to work, but it does work without adapting the feature gate now. I must have made a mistake last time.
I was also doing my tests on 0.108.0.
Please ignore my previous message.
I'm closing this issue for now, but please reopen it if we are still missing something.
@jpkrohling I think this works for the OTLP endpoints indeed, but not for the healthcheck extension (without disabling the feature gate). See the example below, tested on 109.0:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  otlp:
    endpoint: "${COLLECTOR_GATEWAY_ENDPOINT}"
    tls:
      insecure: true
processors:
extensions:
  health_check:
    endpoint: "0.0.0.0:13133"
service:
  extensions: [health_check]
  telemetry:
    logs:
      level: "debug"
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
    traces:
      receivers: [otlp]
      exporters: [otlp]
Collector logs:
2024-09-14T08:11:23.913Z info healthcheckextension@v0.106.1/healthcheckextension.go:32 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"localhost:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"CompressionAlgorithms":null,"ReadTimeout":0,"ReadHeaderTimeout":0,"WriteTimeout":0,"IdleTimeout":0,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2024-09-14T08:11:23.914Z info extensions/extensions.go:56 Extension started. {"kind": "extension", "name": "health_check"}
2024-09-14T08:11:23.914Z info zapgrpc/zapgrpc.go:176 [core] [Server #1]Server created {"grpc_log": true}
2024-09-14T08:11:23.914Z info otlpreceiver@v0.106.1/otlp.go:102 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "0.0.0.0:55680"}
2024-09-14T08:11:23.914Z info otlpreceiver@v0.106.1/otlp.go:152 Starting HTTP server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "0.0.0.0:55681"}
2024-09-14T08:11:23.914Z info healthcheck/handler.go:132 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2024-09-14T08:11:23.914Z info service@v0.106.1/service.go:225 Everything is ready. Begin running and processing data.
2024-09-14T08:11:23.914Z info localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-09-14T08:11:23.914Z info zapgrpc/zapgrpc.go:176 [core] [Server #1 ListenSocket #2]ListenSocket created {"grpc_log": true}
Since the feature gate is planned to be removed in future releases, I am looking for a more long-term solution here. Edit: Raised a new issue in case that behavior is new.
Component(s)
extension/healthcheck
What happened?
Description
We have built an abstraction layer with otel-collector-contrib: an intermediate layer to which all the OTel collectors send telemetry and which exports it to Splunk and Grafana endpoints. The abstraction layer runs on an AWS ECS cluster that is load balanced via an AWS NLB. This setup is automated using an ADO pipeline with a CloudFormation template.
Steps to Reproduce
Attached the config and
Expected Result
Up until v0.100.0, our ECS cluster for the abstraction layer ran healthy and exported telemetry to the exporter endpoints.
Actual Result
With v0.106.1, the AWS NLB target group health checks fail on port 13133 and the CloudFormation template rolls back. It works as expected with v0.100.0.
Collector version
v0.106.1
Environment information
Environment
OS: Amazon Linux
OpenTelemetry Collector configuration
Log output
Additional context
We would like to understand what this change means: "localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}"
How do we disable this change of the default to localhost? Any documentation or steps would be highly appreciated.
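As a rough illustration of the flag discussed in this thread, here is a Compose-style sketch that passes `--feature-gates=-component.UseLocalHostAsDefaultHost` as a container argument. The service name, image tag, and mount path are assumptions for illustration, not taken from the reporter's setup. The sketch also assumes the image's entrypoint is the collector binary, so that `command` only supplies arguments. It only applies while the feature gate still exists in the collector version being run.

```yaml
services:
  otelcol:                       # hypothetical service name
    # Assumed image and tag; substitute the image you build and push.
    image: otel/opentelemetry-collector-contrib:0.106.1
    command:
      - "--config=/etc/otel/config.yaml"
      # The leading "-" disables the gate, reverting the default to 0.0.0.0
      - "--feature-gates=-component.UseLocalHostAsDefaultHost"
    volumes:
      - ./config.yaml:/etc/otel/config.yaml:ro
    ports:
      - "13133:13133"            # health check port from this thread
```

Setting the health check endpoint explicitly in the collector configuration avoids depending on the gate at all.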