We have sometimes observed stmgr and heron-instance running out of memory. However, the scheduler restarted the containers, so we did not have a chance to profile the process memory.
I propose a policy that monitors and profiles process memory, built on top of the dhalion/healthmgr framework.
detector: monitor stmgr/heron-instance memory
diagnoser: if the process memory is too high, trigger the resolver
resolver: start process memory profiling for 1 min, overwriting the last profile
This way the policy always keeps the most recent process memory profile, taken before the scheduler restarts the container.
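
To make the three roles concrete, here is a rough sketch of how they could fit together. It is not the dhalion/healthmgr API: every name in it (`MemorySample`, `detect`, `diagnose`, `ProfileResolver`, `run_policy_once`) is hypothetical, the 90% threshold is only an assumed stand-in for "too high", and the profiler itself is passed in as a callback because the proposal does not say which profiler would be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class MemorySample:
    """One memory reading for a process (hypothetical shape)."""
    process: str       # e.g. "stmgr-1" or "instance-2"
    used_bytes: int
    limit_bytes: int


def detect(samples: Iterable[MemorySample],
           ratio_threshold: float = 0.9) -> List[MemorySample]:
    """Detector: return the samples whose usage is at or above the threshold.

    The 0.9 default is an assumption, not a value from the proposal.
    """
    return [s for s in samples
            if s.used_bytes >= ratio_threshold * s.limit_bytes]


def diagnose(symptoms: Iterable[MemorySample]) -> List[str]:
    """Diagnoser: reduce the symptoms to the names of processes to profile."""
    return sorted({s.process for s in symptoms})


class ProfileResolver:
    """Resolver: start a fixed-length profile for each diagnosed process."""

    def __init__(self,
                 start_profiler: Callable[[str, str, int], None],
                 duration_secs: int = 60) -> None:
        # start_profiler(process, output_path, duration_secs) is a placeholder
        # for whatever mechanism actually profiles the process.
        self._start_profiler = start_profiler
        self._duration_secs = duration_secs

    def resolve(self, processes: Iterable[str]) -> Dict[str, str]:
        paths: Dict[str, str] = {}
        for process in processes:
            # A fixed file name per process means each new profile
            # overwrites the previous one, so only the last copy is kept.
            path = f"{process}.heap.prof"
            self._start_profiler(process, path, self._duration_secs)
            paths[process] = path
        return paths


def run_policy_once(samples: Iterable[MemorySample],
                    resolver: ProfileResolver,
                    ratio_threshold: float = 0.9) -> Dict[str, str]:
    """Run detector -> diagnoser -> resolver for one batch of samples."""
    return resolver.resolve(diagnose(detect(samples, ratio_threshold)))
```

Using one fixed output path per process is what gives the "overwrite the last profile" behaviour, so disk usage stays bounded no matter how often the resolver fires.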
thoughts? @ashvina @avflor @srkukarni @objmagic @maosongfu