TurboTurtle / rig

A lightweight, flexible, easy to use system monitoring and event handling utility
GNU General Public License v2.0

[rfe] Rig 'Running' jobs should be restored after system reboot. #32

Open amtdas opened 4 years ago

amtdas commented 4 years ago

Observation: If the system is rebooted, all active rigs with status 'Running' are lost.
Expectation: All active rigs should be restored once the reboot completes.

[root@rh83-new ~]# rig list
ID     PID    Type    Watching                       Trigger                             Status    
====================================================================================================
qibit  1600  system  system utilization             System loadavg above 6.0            Running   
zphfg  25777  system  system utilization                                                 Running   
fgbuj  25130  logs    messages, journals: system     validate                            Running   
olfpe  24928  logs    dnf.log, journals: system      yum update kernel                   Running   

[root@rh83-new ~]# rig info -i qibit
{
    "id": "qibit",
    "pid": "1600",
    "rig_type": "system",
    "status": "Running",
    "restart_max": 0,
    "restart_count": 0,
    "cmdline": "/usr/bin/rig system --loadavg 6 --kdump",
    "debug": false,
    "watch": "system utilization",
    "trigger": "System loadavg above 6.0",
    "created": "11/23/20 22:41:22",
    "actions": {
        "kdump": {
            "name": "kdump",
            "priority": 10000,
            "expected_result": "A vmcore saved in your configured crash dump location"
        }
    }
}

[root@rh83-new ~]# rig list
ID     PID    Type    Watching                       Trigger                             Status    
====================================================================================================
[root@rh83-new ~]# 

In the scenario above, rig qibit triggered kdump once the system load average rose above 6.0, and the system rebooted. The vmcore was generated as expected, but all other active rigs were lost. Similarly, whenever the reboot command is run, all rig jobs are lost.

TurboTurtle commented 4 years ago

This is beyond the scope of rig. We don't run a service of any kind, as rig is an ad-hoc utility. Restarting prior jobs after a boot would require state tracking and an on-boot service to kick off those jobs again.

It is also highlighted in the README for this repo: Rigs do not persist through reboots.

amtdas commented 4 years ago

Ack. Would this be a suitable RFE? A mechanism to save (rig-save) all active triggers whenever new triggers are added.
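
For illustration, the save half of this idea could be sketched as below. This is a minimal sketch and not part of rig: the function names and the JSON state file are hypothetical, and the only thing it relies on is the "status" and "cmdline" fields visible in the `rig info` output earlier in this thread. It records the command line of every running rig and reads the list back as argument vectors.

```python
import json
import shlex


def save_running_rigs(rig_infos, path):
    """Write the cmdline of every rig whose status is 'Running' to a JSON file.

    rig_infos is a list of dicts shaped like the `rig info` output above.
    The state file format (a plain JSON list of strings) is an assumption.
    """
    cmdlines = [
        info["cmdline"]
        for info in rig_infos
        if info.get("status") == "Running" and info.get("cmdline")
    ]
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cmdlines, fh, indent=2)
    return cmdlines


def load_restore_commands(path):
    """Read the saved cmdlines back as argv lists, e.g. for subprocess.run()."""
    with open(path, encoding="utf-8") as fh:
        return [shlex.split(cmdline) for cmdline in json.load(fh)]
```

Actually re-running the commands after boot would still need something outside rig (a boot-time unit or similar) to call the restore step. Restoring a rig whose action reboots the machine, such as the kdump rig above, could also re-trigger immediately if the condition still holds, so a real implementation would need a guard for that case.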

TurboTurtle commented 4 years ago

Hmm, it is definitely worth exploring. I could see a potential use case where a user does a rig save or somesuch, and then a rig restore at a later date. However, I'm having trouble seeing this as a widely used feature.

What could be built off of this, however, is some sort of template structure - a kind of recipe-style approach where most (all?) of the details are already pre-filled and a user just needs to populate/override specifics.

Let's suss this out a bit more - if it's something that could potentially be used even somewhat frequently then it's worth the investigation. In any event, it is likely to be a long-term RFE.
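
As a rough illustration of the recipe-style template idea above, the sketch below merges user overrides into a pre-filled template and renders a command line. Everything here is hypothetical: rig has no such template format, and `build_cmdline` and `HIGH_LOAD_KDUMP` are invented names. The only options used are `--loadavg` and `--kdump`, taken from the cmdline shown in the `rig info` output earlier in the thread.

```python
def build_cmdline(template, overrides=None):
    """Merge overrides into a template's options and render a rig argv list."""
    options = {**template["options"], **(overrides or {})}
    argv = ["rig", template["rig_type"]]
    for name, value in options.items():
        if value is True:
            # Boolean True renders as a bare flag, e.g. --kdump
            argv.append(f"--{name}")
        elif value is not False and value is not None:
            # Any other value renders as "--name value"; False/None drop the option
            argv.extend([f"--{name}", str(value)])
    return argv


# Hypothetical template mirroring the cmdline of rig qibit shown earlier.
HIGH_LOAD_KDUMP = {
    "rig_type": "system",
    "options": {"loadavg": 6, "kdump": True},
}
```

With this shape, a user would only pass what differs from the recipe, for example `build_cmdline(HIGH_LOAD_KDUMP, {"loadavg": 10})` to raise the threshold while keeping the kdump action.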

amtdas commented 4 years ago

Thanks for detailing it. My understanding of this as a future feature matches your comment above. I see it as highly useful where large numbers of rig triggers are scheduled. I will mark it as an RFE. Thank you.