open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.88k stars, 2.26k forks

Supervisor does not retry connecting to the OpAMP server forever #33408

Closed cforce closed 1 month ago

cforce commented 3 months ago

Component(s)

cmd/opampsupervisor

What happened?

When started, the supervisor should not give up connecting to the OpAMP backend when there are connectivity errors. At the very least, the timeout before giving up should be configurable. For resilience in an unstable (e.g. cellular) network environment, the only alternative today is an external scheduler such as systemd that restarts the supervisor, instead of the supervisor retrying by itself. Such external restarts would also increase CPU load. An "endless" loop with a retry timeout is the best practice for client-to-server communication retries.

Despite the errors, the log indicates that retries are happening (e.g. the "will retry" message). However, if it looks as though it is not retrying, this might be due to:

Immediate failures: the connection attempts might be failing too quickly in succession, making it appear as if there is no retry mechanism. There might be configuration settings that limit or control the retry behavior which I don't know about. Why does the supervisor have such a fixed (instead of unlimited) retry policy or limit? I feel the supervisor code is written this way to handle error situations, but it should retry resiliently.

Collector version

0.101

Environment information

No response

OpenTelemetry Collector configuration

No response

Log output

2024-06-05T08:03:37.098+0200    DEBUG   commander/commander.go:74       Starting agent  {"agent": "./otelcollector"}
2024-06-05T08:03:37.100+0200    DEBUG   commander/commander.go:93       Agent process started   {"pid": 196962}
2024-06-05T08:03:37.262+0200    DEBUG   commander/commander.go:160      Stopping agent process  {"pid": 196962}
2024-06-05T08:03:37.267+0200    DEBUG   supervisor/logger.go:21 Agent disconnected: websocket: close 1000 (normal): Normal closure
2024-06-05T08:03:37.272+0200    DEBUG   commander/commander.go:176      Agent process successfully stopped.     {"pid": 196962}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:151    Supervisor starting     {"id": "01HZKFXTR406AFVGQT5ZYC0GEK"}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:369    Connecting to OpAMP server...   {"endpoint": "ws://xxx:XX/v1/opamp", "headers": {"Agent-ID":[""],"Authorization":["Secret-Key XXXXXXXXXXXXXXXXXXXXXXXXX"]}}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:419    Starting OpAMP client...
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:426    OpAMP Client started.
2024-06-05T08:03:37.274+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.274+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.274+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:47.273+0200    ERROR   opampsupervisor/main.go:24      failed to connect to the OpAMP server: %!w(<nil>)
main.main
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/main.go:24
runtime.main
        /usr/local/go/src/runtime/proc.go:267

Additional context

No response

github-actions[bot] commented 3 months ago

Pinging code owners:

JaredTan95 commented 3 months ago

Do you mean that when the connection to the OpAMP server fails, the supervisor should exit the process?

cforce commented 3 months ago

The opposite: it should never exit, but retry to connect or reconnect forever.

tigrannajaryan commented 2 months ago

The opposite: it should never exit, but retry to connect or reconnect forever.

+1. This is the intent.

cforce commented 2 months ago

Will that PR solve it? It seems the review is on your side, @tigrannajaryan ;)

https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33275