fermi-ad / controls

Central repo for reporting bugs, making feature requests, managing RFCs, and requesting seminar topics.
https://www-bd.fnal.gov/controls/

DPM unavailable due to Out of Memory errors #60

Closed finstrom closed 5 months ago

finstrom commented 6 months ago

On Friday, 4/5, at 7:30am and again on Sunday, 4/7, at 8:30pm, all operational DPM daemons threw Java out-of-memory exceptions and stopped functioning. Charlie King is investigating. Stan Johnson and Mike Guzman have directions for restarting DPM if this happens again and Controls staff aren't immediately reachable. They were asked to first verify that the log file shows out-of-memory exceptions, and then to restart only dce02 through dce08, leaving dce01 untouched for debugging.
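As a rough illustration of the restart rule described above, here is a minimal Python sketch. It assumes the log records the standard JVM error name `java.lang.OutOfMemoryError`; the function names are hypothetical, and it only decides which hosts qualify for a restart. It does not read the real log file or perform a restart, since neither the log location nor the restart command is given here.

```python
import re

# The JVM reports memory exhaustion as java.lang.OutOfMemoryError.
OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError")

# Hosts named in this issue: dce01 through dce08.
ALL_HOSTS = [f"dce{n:02d}" for n in range(1, 9)]
# dce01 is left alone so its failed state can be examined.
DEBUG_HOST = "dce01"


def log_shows_oom(log_text):
    """Return True if the log text contains a Java out-of-memory error."""
    return OOM_PATTERN.search(log_text) is not None


def hosts_to_restart(log_text):
    """Return the hosts to restart, or an empty list if no OOM was logged."""
    if not log_shows_oom(log_text):
        return []
    return [host for host in ALL_HOSTS if host != DEBUG_HOST]
```

Returning an empty list when the log shows no out-of-memory error mirrors the instruction to check the log before restarting anything.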