FlowFuse / device-agent

An agent to run FlowFuse managed instances of Node-RED on devices
Apache License 2.0

Possible to have 2 instances of NR running at same time #227

Closed hardillb closed 9 months ago

hardillb commented 9 months ago

Current Behavior

If a restart is triggered while the device agent is already restarting a crashed NR instance then it can start a second instance.

Expected Behavior

A restart should not start a second instance.

Steps To Reproduce

  1. force a crash of NR (or a state where a normal kill doesn't stop the process immediately)
  2. send a restart command to the device agent

Environment

Steve-Mcl commented 9 months ago

@hardillb

Ben, I am exploring the possibility that this might be related to https://github.com/FlowFuse/nr-project-nodes/issues/62 but I am not sure how to recreate it:

  • force a crash of NR (or a state where a normal kill doesn't stop the process immediately)

Any pointers on how to achieve this?

  • send a restart command to the device agent

From where? There are (currently) no options from FF core to send a restart to a device. Do you mean a systemd / service restart?

hardillb commented 9 months ago

    force a crash of NR (or a state where a normal kill doesn't stop the process immediately)

Any pointers on how to achieve this?

A badly written node that doesn't implement a node.on('close',...) handler and so prevents Node-RED from shutting down.

   send a restart command to the device agent

From where? There are (currently) no options from FF core to send a restart to a device. Do you mean a systemd / service restart?

No. Tag a new snapshot in the forge app; this tells the device agent to shut down the NR instance (with a kill -HUP), which then falls foul of the previous comment.

Look at the nr-launcher code, it has a test to see if the kill -HUP failed and then sends a kill -9 to ensure the NR instance is dead.

Any code that restarts Node-RED should wait for it to be properly gone before starting a new NR process.
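A rough sketch of that idea in Node.js, assuming the child was spawned successfully. This is not the device agent or nr-launcher code: `stopAndWait`, `Launcher`, and `graceMs` are hypothetical names. It sends SIGHUP, escalates to SIGKILL if the process is still alive after a grace period, and only spawns a replacement once the old process has emitted `exit`. Overlapping restart requests share one in-flight restart, so a second request cannot start a second instance.

```javascript
const { spawn } = require('node:child_process')

// Hypothetical helper: resolves once `child` has really exited.
// Sends SIGHUP first, then SIGKILL if it is still alive after `graceMs`.
function stopAndWait (child, graceMs = 5000) {
    return new Promise((resolve) => {
        // Already gone: nothing to wait for
        if (child.exitCode !== null || child.signalCode !== null) {
            resolve()
            return
        }
        const timer = setTimeout(() => child.kill('SIGKILL'), graceMs)
        child.once('exit', () => {
            clearTimeout(timer)
            resolve()
        })
        child.kill('SIGHUP')
    })
}

// Hypothetical launcher that never runs two children at once
class Launcher {
    constructor (command, args = [], graceMs = 5000) {
        this.command = command
        this.args = args
        this.graceMs = graceMs
        this.child = null
        this.pending = null // in-flight restart, if any
    }

    start () {
        this.child = spawn(this.command, this.args, { stdio: 'ignore' })
        return this.child
    }

    restart () {
        // A restart requested while one is in progress reuses the same promise
        if (!this.pending) {
            const run = async () => {
                if (this.child) {
                    await stopAndWait(this.child, this.graceMs)
                }
                return this.start()
            }
            this.pending = run().finally(() => { this.pending = null })
        }
        return this.pending
    }
}

module.exports = { stopAndWait, Launcher }
```

With this shape, calling `restart()` twice in quick succession resolves both calls to the same new child process.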

joepavitt commented 9 months ago

Can I check the status of this - it's got an attached PR that's been merged, so can it be closed? Or is it just needing to be verified on staging?

Steve-Mcl commented 9 months ago

It should have been linked to the PR and would therefore have been closed automatically. I'll close it with this comment.