Closed eachristgr closed 1 month ago
[haozturk@lxplus996 ~]$ k logs webui-rucio-ui-5544c44759-2tnpf
Defaulted container "httpd-error-log" out of: httpd-error-log, rucio-ui
[Thu Jul 18 05:48:33.333140 2024] [mpm_event:error] [pid 7:tid 7] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
[Thu Jul 18 05:48:34.334202 2024] [mpm_event:error] [pid 7:tid 7] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
Thanks, Christos, for reporting this. A simple restart should fix it and reset the connections with busy clients.
Please let me know if it happens again, and we can update the server config to accommodate high loads. Can you check whether it works for you now, too?
Just for my records: an example use case for monitoring is in https://github.com/dmwm/CMSRucio/issues/381
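As a rough illustration of what such monitoring could look for, here is a minimal Python sketch that counts `AH03490` lines per day in httpd error-log output. The function name `count_scoreboard_full` and the idea of feeding it saved pod log lines are assumptions for illustration, not something already deployed; the sample lines are the ones quoted in this issue.

```python
import re
from collections import Counter

# Matches httpd error-log lines such as:
# [Thu Jul 18 05:48:33.333140 2024] [mpm_event:error] [pid 7:tid 7] AH03490: ...
LINE_RE = re.compile(
    r"^\[\w{3}\s+(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+"
    r"[\d:.]+\s+(?P<year>\d{4})\] \[mpm_event:error\] .*\bAH03490:"
)


def count_scoreboard_full(lines):
    """Return a Counter mapping 'YYYY Mon DD' to the number of AH03490 lines.

    Hypothetical helper: the name and the output key format are illustrative.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            key = f"{m['year']} {m['month']} {int(m['day']):02d}"
            counts[key] += 1
    return counts


# Sample input copied from the log excerpt in this issue.
sample = [
    "[Thu Jul 18 05:48:33.333140 2024] [mpm_event:error] [pid 7:tid 7] "
    "AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.",
    "[Thu Jul 18 05:48:34.334202 2024] [mpm_event:error] [pid 7:tid 7] "
    "AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.",
]

print(count_scoreboard_full(sample))
```

A non-zero count for a given day could then be turned into an alert, though the threshold and the alerting channel would need to be decided separately.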
Hi @dynamic-entropy, thanks for taking this. The issue seems to be resolved; I can access https://cms-rucio-webui.cern.ch/ without any problem.
It happened again:
$ k logs webui-rucio-ui-f79f5b6db-fj86l
Defaulted container "httpd-error-log" out of: httpd-error-log, rucio-ui
[Mon Jul 29 01:15:41.144028 2024] [mpm_event:error] [pid 7:tid 7] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
[Mon Jul 29 01:15:42.145535 2024] [mpm_event:error] [pid 7:tid 7] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
[Mon Jul 29 01:15:43.145649 2024] [mpm_event:error] [pid 7:tid 7] AH03490: scoreboard is full, not at MaxRequestWorkers.Increase ServerLimit.
https://mattermost.web.cern.ch/cms-o-and-c/pl/d99nw33cwpbwmyodirit7sodsw
We need to revisit the server limits.
This was also seen by ATLAS. The suggested solution was to either increase an internal value or scale up the pods. Since we were running only one pod, I scaled it up to four. We should reopen this if we see it again.
It seems load-related anyhow.
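To make the two options a bit more concrete, here is a small Python sketch of the arithmetic behind them. It assumes the Apache event MPM defaults (`ServerLimit 16`, `ThreadsPerChild 25`, `MaxRequestWorkers 400`); the values actually used in the CMS webui container are not stated in this thread, and the raised `ServerLimit` of 24 is purely hypothetical. The function names are also illustrative.

```python
import math


def scoreboard_headroom(server_limit, threads_per_child, max_request_workers):
    """Spare process slots beyond those needed to reach MaxRequestWorkers.

    With zero headroom, child processes that are still finishing gracefully
    can occupy every slot and prevent new children from starting.
    """
    needed = math.ceil(max_request_workers / threads_per_child)
    return server_limit - needed


def total_workers(replicas, max_request_workers):
    """Upper bound on concurrent requests across identical pods."""
    return replicas * max_request_workers


# Assumed Apache event MPM defaults, not the confirmed CMS settings.
print(scoreboard_headroom(16, 25, 400))  # 0 spare slots
# Hypothetical higher ServerLimit with the other values unchanged.
print(scoreboard_headroom(24, 25, 400))  # 8 spare slots
# Going from one pod to four, as described above.
print(total_workers(1, 400), total_workers(4, 400))  # 400 1600
```

Under these assumptions, raising `ServerLimit` adds slack within a single pod, while adding pods spreads the load so that each httpd instance is less likely to fill its scoreboard in the first place.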
Bug Description
Trying to access https://cms-rucio-webui.cern.ch/ returns a timeout error.
Checking the logs of the relevant pod, it seems to be an Apache issue:
On the other hand, https://cms-rucio-webui-int.cern.ch/ seems to work fine.
Reproduction Steps
No response
Expected Behavior
No response
Possible Solution
No response
Related Issues
No response