Closed kuznas closed 7 months ago
Hi @kuznas
I recommend using triple backticks instead of single ones when pasting large blocks of machine text. With that, the log would look much more readable.
From the last few lines of the log it seems that the connection between Central and Sensor was closed.
common/sensor: 2023/07/11 08:07:58.580080 central_communication_impl.go:170: Info: Communication with central ended.
common/sensor: 2023/07/11 08:07:58.580435 sensor.go:298: Info: Terminating central connection.
main: 2023/07/11 08:07:58.580483 main.go:73: Info: Sensor exited normally
It is the current behavior that Sensor shuts down upon closed connection. Next, Kubernetes restarts Sensor and Sensor tries reconnecting to Central again.
The Sensor <-> Central connection is a gRPC stream and is persistent, i.e. it should ideally stay open for as long as the cluster exists. In the most typical deployments, Central exposure is configured such that the connection stays persistent.
That may be different when custom ingress or proxies are configured between Central and Sensor. In your case, I would suggest checking whether nginx has a timeout configured after which it closes the Sensor <-> Central connection. Such a timeout is likely the cause of the observed restarts.
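If Central is exposed through ingress-nginx, the relevant timeouts can be raised with annotations on the Ingress resource. A fragment like the following is a sketch only; the annotation values are illustrative, and your Ingress name and spec will differ:

```yaml
# Illustrative ingress-nginx annotations for a long-lived gRPC stream.
metadata:
  annotations:
    # Tell ingress-nginx to proxy gRPC to the Central backend.
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    # Raise idle timeouts so the persistent Sensor stream is not cut off.
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
```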
The fact that Sensor restarts on connection loss is known and is being addressed. However, it may take a few releases until the behavior is ultimately fixed. Collector pod restarts are a cascading effect of Sensor restarts and should go away once the Sensor restart issue is fixed.
Hope this helps
Hi @msugakov. For now I've worked around this by wrapping the sensor container's `kubernetes-sensor` command in a `while true` loop. Now there have been no restarts in the last hour.
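For illustration, that workaround would look roughly like the following container override. This is a sketch of the hack being described, assuming the container's entrypoint binary is `kubernetes-sensor`; as noted below, it should not be necessary:

```yaml
# Hypothetical sensor container command override: re-run the binary
# in-process on exit instead of letting the pod crash-loop.
command: ["/bin/sh", "-c"]
args: ["while true; do kubernetes-sensor; sleep 5; done"]
```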
I don't think adding such hacks is necessary. How do you build and deploy Sensor?
I’ve deployed Secured Cluster Services manually using Helm, but my goal is to deploy it using ArgoCD. Sensor was deployed using the script inside the zip bundle generated when adding a new cluster.
What is the status of this? From the above it seems that the root problem is Sensor not being able to communicate with Central.
Hello! Using this trick everything works properly. All clusters on the internal network with the Central cluster are OK. But now the remaining trouble is connecting external Secured Clusters to Central. I’ve described it in the Slack channel: https://cloud-native.slack.com/archives/C01TDE3GK0E/p1691050009654009
Hi, is there any way to upload the information from Red Hat here? Unfortunately, I don't have a subscription :( https://access.redhat.com/solutions/7026261
Hey! The logs seem to indicate that it's failing to connect to Central via gRPC. What load balancer is Central behind? If you're using, say, an Amazon ALB or a similar load balancer that does not support gRPC, you'll need to use the WebSocket protocol instead. When deploying your sensor, you can specify this by adding the wss:// scheme to your central endpoint, i.e. wss://$STACKROX_CENTRAL_ENDPOINT:443. Hope this fixes your issue!
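With a Helm-based deployment, this could be expressed in the Secured Cluster Services values. A minimal sketch, assuming the chart's `centralEndpoint` value and a hypothetical hostname:

```yaml
# values.yaml fragment for the secured-cluster-services Helm chart.
# The wss:// scheme tunnels the stream over WebSocket for load
# balancers (e.g. AWS ALB) that cannot pass gRPC through.
centralEndpoint: "wss://central.example.com:443"
```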
Closing inactive issue, please feel free to reopen when you have more updates.
Hello everyone. I've deployed Central on k8s cluster A and Secured Cluster Services + Sensor on cluster B. Everything works, but the Sensor pod on cluster B always fails, which leads to the collector pods failing too. After 3-5 minutes of doing absolutely nothing, it is running again.
Logs of sensor pod:
This is a cycle of failures and recoveries:
sensor: Completed - CrashLoopBackOff - Running; Collector: CrashLoopBackOff - Running
Central is exposed via a LoadBalancer with ingress-nginx. Why does the pod lose its connection?
Can anyone please help me to solve this issue?