Open yachang opened 3 years ago
You can see the green line (the scamper daemon died); scamper was then brought up and the pid leak started.
About 11 hours later, the pid leak crashed everything.
Until we nail down the pid leak in scamper, we will replace the flag
"scamper-daemon-with-scamper-backup"
with
"scamper-daemon"
A k8s-support PR will follow.
The PID count over the same time period during which "scamper" was active in the image above.
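
To spot-check a PID count like this directly on a node, here is a minimal sketch. It assumes a Linux `/proc` filesystem; the function name `count_pids_by_name` is made up for illustration and is not part of scamper or any M-Lab tooling. It counts processes (numeric entries under `/proc`), not threads, so it may undercount against a limit that also includes threads.

```python
import os
from collections import Counter


def count_pids_by_name(proc_root="/proc"):
    """Return a Counter mapping process name -> number of PIDs with that name.

    Hypothetical helper: reads <proc_root>/<pid>/comm for every numeric entry.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                counts[f.read().strip()] += 1
        except OSError:
            continue  # the process exited between listdir() and open()
    return counts


if __name__ == "__main__":
    # Print the five process names that hold the most PIDs.
    for name, n in count_pids_by_name().most_common(5):
        print(f"{n:6d} {name}")
```

Running this periodically and watching whether the "scamper" count keeps growing would give a rough view of the same trend.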
A pid leak was detected on mlab3.fln01:
https://github.com/m-lab/ops-tracker/issues/1204
After investigating the logs, we found that the pid leak was caused by scamper after scamper-daemon failed: