Open scottyeager opened 1 year ago
@Omarabdul3ziz have a look at this
The decision to consider `poweringOff` as standby was intentional, to avoid scenarios where a node goes into standby shortly after being used.
If a node fails to power off, it may retry, or this may indicate an issue with it. The `up` status should only apply to valid nodes that are ready for use. That is what I think. What do you think?
I also checked from the farmerbot side. It sets the target to `Down` when trying to power off the node, and a node is only considered up when both state and target are `Up`, which is what we also do on the proxy. That decision was made for the reason I explained above.
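The rule described in this comment can be sketched roughly as follows. This is a minimal illustration, not the actual grid-proxy code: the `Power` type, the `current_status` function, and the status strings are hypothetical names. The sketch ignores uptime reports and assumes that `poweringOff` means a state of `Up` with a target of `Down`; how other combinations are treated is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Power:
    """Hypothetical stand-in for a node's power record."""
    state: str   # what the node last reported: "Up" or "Down"
    target: str  # what the farmerbot asked for: "Up" or "Down"


def current_status(power: Power) -> str:
    """Status under the rule described above (assumed, simplified)."""
    # Only a node whose state and target are both "Up" counts as up.
    if power.state == "Up" and power.target == "Up":
        return "up"
    # Everything else, including poweringOff (state Up, target Down),
    # is shown as standby in this sketch.
    return "standby"
```

Under this rule a node that never acts on the power-off request keeps a state of `Up` and a target of `Down`, so it stays labeled as standby for as long as it is stuck.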
When nodes are functioning properly, the current approach and given reasoning make sense, sure. The issue with this approach though is that it masks potential issues with nodes:
From the perspective of someone deploying on the Grid, it doesn't matter much if these nodes are shown as standby or down, as long as they don't get selected for a deployment. But from the farmer's perspective, it makes the node look like it's functioning normally when in fact it is not.
Properly functioning nodes will generally spend only a matter of seconds in the `poweringOff` state. It's much more likely that someone would try to deploy to the node during the wake up period when the node is shown as up. Since the type of error I describe above is rare, it's also rather unlikely that someone would try to deploy to a node in the error state.
So to me we are masking important errors in Zos that need to be addressed with a very marginal benefit for users who might try to deploy on these nodes in very rare cases. Furthermore, the deployment interface itself could take care of not allowing users to deploy to nodes with a power target of "Down".
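As a rough sketch of that last point, a deployment interface could filter candidate nodes on power target before offering them. The `Node` fields and the `deployable` helper below are hypothetical, not an existing Dashboard or grid-proxy API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node record as a deployment UI might see it."""
    node_id: int
    status: str        # label shown to users, e.g. "up" or "standby"
    power_target: str  # "Up" or "Down"


def deployable(nodes: list[Node]) -> list[Node]:
    """Keep only nodes that are up and not being asked to power off."""
    return [n for n in nodes if n.status == "up" and n.power_target != "Down"]


nodes = [
    Node(1, "up", "Up"),
    Node(2, "up", "Down"),      # online, but the farmerbot wants it off
    Node(3, "standby", "Down"),
]
print([n.node_id for n in deployable(nodes)])  # prints [1]
```

With a filter like this, a node could be labeled "Up" while powering off without becoming selectable for new deployments.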
While troubleshooting some issues with nodes responding to power target changes from the farmerbot, I noticed that nodes that are actually online because they failed to respond show as "Standby" in the Dashboard.
The reason is the logic here in `grid-proxy`, since `poweringOff` is selected for standby. I suggest labeling nodes that haven't yet set their power state to down as "Up".
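The suggested labeling could look something like the sketch below. The function name, the argument, and the status strings are illustrative assumptions rather than grid-proxy's actual code, and uptime checks are left out.

```python
def proposed_status(state: str) -> str:
    """Label a node by the power state it has actually reported."""
    # A node only becomes standby once it has set its own state to "Down".
    if state == "Down":
        return "standby"
    # Otherwise it is still online, whatever the power target says,
    # so a node that failed to power off keeps showing as up.
    return "up"


# State is still "Up" even though the farmerbot set the target to "Down".
print(proposed_status("Up"))  # prints up
```

The power target is deliberately not consulted here, which is what keeps a node that ignored a power-off request visible as "Up" to the farmer.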