Closed scottyeager closed 1 year ago
It looks like the farmerbot is not receiving an answer (in time) from the nodes. The farmerbot sends RMB messages to check the status of each node. If a node doesn't respond, the farmerbot assumes it is off. If that contradicts the status the farmerbot expects, it shows an error, for example if it set the target to powering on and the node is still not reachable after 30 minutes. What happened here is that it assumed the node to be ON but didn't get an answer from it, so it returned an error. In fact it received no answer from any of the nodes, so I think something is wrong with the rmb-peer here.
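To make the check described above concrete, here is a minimal sketch in Python. It is not the farmerbot's actual code: the names `State`, `Node`, and `check_node` are hypothetical, and only the 30-minute wakeup window and the "no reply means unreachable" rule come from the explanation above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    ON = "on"
    OFF = "off"
    WAKING_UP = "waking_up"


# 30 minutes, as mentioned in the comment above.
WAKEUP_TIMEOUT_S = 30 * 60


@dataclass
class Node:
    node_id: int
    state: State          # state the bot currently assumes for the node
    state_since: float    # timestamp (seconds) when that state was set


def check_node(node: Node, responded: bool, now: float) -> Optional[str]:
    """Compare the assumed state with whether the node answered over RMB.

    Returns an error message when the two contradict each other, else None.
    """
    if responded:
        # Any reply means the node is up; update the assumed state if needed.
        if node.state is not State.ON:
            node.state = State.ON
            node.state_since = now
        return None

    # No reply: only an error if the bot expected the node to be reachable.
    if node.state is State.ON:
        return f"node {node.node_id}: expected ON but got no RMB reply"
    if node.state is State.WAKING_UP and now - node.state_since > WAKEUP_TIMEOUT_S:
        return f"node {node.node_id}: still unreachable 30 minutes after wakeup"
    return None
```

In this sketch, a broken rmb-peer would make `responded` false for every node, so every node assumed ON would produce an error even though the nodes themselves are fine.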
Can I get the logs from all containers? You can collect them with: docker compose logs > farmerbot.log
I've requested these logs and will report back when the farmer provides them.
Hi @scottyeager. Did you get the logs?
Hi @brandonpille, here are the logs from all containers: https://gist.github.com/scottyeager/357f87e534cf054a676f417e2481cdd6
I noticed that rmbpeer reported a connection reset a couple of times, but after that it looks normal. I asked the farmer to restart the farmerbot and provide another log file.
And here's the new log after restarting the bot: https://gist.github.com/scottyeager/0eacd9419ae78fe000af73bafaaabbf2
Hi @brandonpille, any update on this?
It looks like rmbpeer is not working properly here. Can I see the output of docker container ls --all?
Can he try restarting the rmb-peer? First find the container name of the peer in the output of docker container ls. It should end with "-rmbpeer-1". Then run:
docker container restart <container name>
Hi @brandonpille,
The farmer updated to the latest version of the containers and thus restarted the bot completely in the process. Now he reports that one node (4382) is going to sleep as expected, while the others are not. Here is an updated set of logs from all the containers:
https://gist.github.com/scottyeager/4d8d55a4af84b09593d146005e7f6764
I see a lot of "received reply of an expired message" messages, which means the RMB replies that the farmerbot needs are not arriving in time (within 5 seconds). That tells me that issue https://github.com/threefoldtech/farmerbot/issues/30 is getting more and more urgent.
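As a rough illustration of why a late reply shows up as "expired", here is a small Python sketch. It does not reflect how rmb-peer is actually implemented: `PendingRequests` and its methods are hypothetical names, and only the 5-second window is taken from the comment above.

```python
# 5 seconds, as mentioned in the comment above.
REPLY_TIMEOUT_S = 5.0


class PendingRequests:
    """Tracks when each request was sent so late replies can be detected."""

    def __init__(self, timeout_s: float = REPLY_TIMEOUT_S):
        self.timeout_s = timeout_s
        self._sent_at = {}  # message id -> send timestamp (seconds)

    def sent(self, msg_id: str, now: float) -> None:
        self._sent_at[msg_id] = now

    def on_reply(self, msg_id: str, now: float) -> str:
        sent_at = self._sent_at.pop(msg_id, None)
        if sent_at is None:
            return "unknown"   # no matching request on record
        if now - sent_at > self.timeout_s:
            return "expired"   # the node did answer, but too late to be used
        return "ok"
```

The point of the sketch is that an "expired" reply still proves the node is online: the answer arrived, just after the caller had stopped waiting for it.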
This should be fixed with the new release.
A farmer recently set up the farmerbot and has some nodes that should be eligible to shut down. The nodes aren't shutting down as expected.
Here's the log file: https://gist.github.com/scottyeager/3720726963a78870a4101ff2b447137c
Notable excerpt from the bottom of the file:
Similar messages are repeated through the logs, indicating unsuccessful wakeups for different nodes. According to GraphQL, they are Up/Up:
I tried pinging 4381 and 4382 over RMB and they responded immediately.