Open mahendravarmayadala93 opened 2 months ago
@rawdaGastan: There is a more recent report from a second farmer (farm ID 250) showing the same error lines in the logs he obtained.
farmer@bot:~/farmerbot$ tail -n 50 farmerbot.log
2024/11/18 14:08:47 Connecting to wss://tfchain.grid.tf:443...
2:08PM INF starting peer session=farmerbot-rpc-250 twin=826
2:08PM DBG connecting url=wss://tfchain.grid.tf/ws
2024/11/18 14:08:49 Connecting to wss://tfchain.grid.tf/ws...
2:08PM DBG connecting url=wss://relay.grid.tf
2:08PM DBG Add node nodeID=3736
2:08PM DBG failed to read message error="websocket: close 1006 (abnormal closure ): unexpected EOF"
2:08PM DBG connecting url=wss://relay.grid.tf
2:08PM DBG Add node nodeID=4746
2:08PM DBG failed to read message error="websocket: close 1006 (abnormal closure ): unexpected EOF"
All nodes included in this config are currently up in the dashboard, so we can probably rule out the suspicion that the nodes were unhealthy before being added to the farmerbot. Also, the --continue-power-on-error flag is already included in the script that was used to set it up.
farm_id: 250 included_nodes:
- You can try to use the --continue-power-on-error flag
We are advising all farmers to use this flag, but it doesn't seem to help in every case. Aside from the EOF error above, regular timeouts while trying to reach powered-off nodes also seem to block the bot from starting, for example:
9:13PM FTL error="failed to add node with id 2950 with error: failed to get node 2950 statistics from rmb with error: context deadline exceeded"
@rawdaGastan, can you clarify the expected behavior with --continue-power-on-error?
The --continue-power-on-error flag allows the farmerbot to continue updating and managing nodes even if some nodes have errors in their RMB connection. Without the flag, the farmerbot won't be able to start if some nodes have issues with RMB.
It is expected that nodes cannot communicate through RMB when they are offline.
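The behavior described above can be sketched as a startup loop. This is only an illustration of the stated intent, not the farmerbot's actual code: `add_nodes`, `NodeError`, `fetch_stats`, and `continue_on_error` are hypothetical names, and the failing node ID is taken from the log line quoted earlier in this thread.

```python
class NodeError(Exception):
    """Hypothetical stand-in for an RMB failure (timeout, closed socket, ...)."""


def add_nodes(node_ids, fetch_stats, continue_on_error):
    """Try to add every node; return (added, skipped).

    With continue_on_error=False the first RMB failure aborts startup.
    With continue_on_error=True the failing node is skipped and the
    remaining nodes are still added.
    """
    added, skipped = [], []
    for node_id in node_ids:
        try:
            fetch_stats(node_id)  # assumed to raise NodeError on RMB failure
        except NodeError as err:
            if not continue_on_error:
                raise RuntimeError(
                    f"failed to add node with id {node_id}: {err}"
                ) from err
            skipped.append(node_id)
            continue
        added.append(node_id)
    return added, skipped


def fake_fetch(node_id):
    """Illustrative fetcher: node 2950 times out, every other node answers."""
    if node_id == 2950:
        raise NodeError("context deadline exceeded")
    return {"node_id": node_id}


added, skipped = add_nodes([3736, 2950, 4746], fake_fetch, continue_on_error=True)
```

Under this reading, a single unreachable node should end up in `skipped` rather than stopping the bot, which is the behavior the reports in this thread do not observe.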
The --continue-power-on-error flag allows the farmerbot to continue updating and managing nodes even if some nodes have errors in their RMB connection. Without the flag, the farmerbot won't be able to start if some nodes have issues with RMB.
This matches what we expected. The problem is that we are seeing various cases where the bot does not start due to an RMB error, despite the --continue-power-on-error flag being passed. That's why I was trying to clarify whether there is still some case that should cause the bot to refuse to start due to RMB failures with the flag present.
Assuming no such case exists, our issue is that the bot is still refusing to start with --continue-power-on-error.
It is expected that nodes cannot communicate through RMB when they are offline.
These errors are coming from online nodes. I'm also not sure how severe these errors are. I did some searching on the EOF error for websockets and found this:
The error indicates that the peer closed the connection without sending a close message. The RFC calls this "abnormal closure", but the error is normal to receive.
But I also found some different suggestions about adjusting timeouts over reverse proxies, etc. So I guess it would also be good to clarify if this EOF error is something that we should be concerned with addressing.
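One way to get a first read on severity is to group the log lines by their level field. In the excerpts quoted in this thread, the EOF lines appear at DBG while the timeout appears at FTL, which suggests (but does not prove) that only the latter stops the bot. The sketch below assumes the `H:MMAM/PM LVL message` layout seen in those excerpts; `group_by_level` and `SAMPLE` are illustrative names, not part of the farmerbot.

```python
import re
from collections import defaultdict

# Matches lines such as: 2:08PM DBG Add node nodeID=3736
LINE = re.compile(r"^\d{1,2}:\d{2}(?:AM|PM) (?P<level>[A-Z]{3}) (?P<msg>.*)$")


def group_by_level(lines):
    """Return {level: [messages]} for lines in the assumed log layout.

    Lines in any other layout (for example the date-prefixed
    "Connecting to ..." lines) are ignored.
    """
    groups = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            groups[match.group("level")].append(match.group("msg"))
    return dict(groups)


# Lines copied from the excerpts in this thread.
SAMPLE = [
    "2024/11/18 14:08:47 Connecting to wss://tfchain.grid.tf:443...",
    "2:08PM DBG Add node nodeID=3736",
    '2:08PM DBG failed to read message error="websocket: close 1006 (abnormal closure ): unexpected EOF"',
    '9:13PM FTL error="failed to add node with id 2950 with error: '
    'failed to get node 2950 statistics from rmb with error: context deadline exceeded"',
]

groups = group_by_level(SAMPLE)
for level, messages in groups.items():
    print(level, len(messages))
```

Running this over a full farmerbot.log would show whether any FTL line appears shortly after the EOF messages, which would help separate harmless reconnect noise from the error that actually prevents startup.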
What happened?
Farm ID: 195
The client reported that his nodes managed by the farmerbot did not shut down.
Upon reviewing the log file, we found that the farmerbot was not starting up due to the following error:
Some additional notes by @scottyeager:
Log File :
farmerbot_16enuun.log
which network/s did you face the problem on?
Main
Twin ID/s
No response
Version
No response
Node ID/s
626, 548, 547 (Offline currently) - 3038 (Online)
Farm ID/s
195
Contract ID/s
No response
Relevant log output