threefoldtech / zos

Autonomous operating system
https://threefold.io/host/
Apache License 2.0

Farmerbot - Fujitsu server issues #1976

Closed JerzyJasiewicz closed 1 year ago

JerzyJasiewicz commented 1 year ago

Hi,

I have 2 farms with Fujitsu RX600 S6 servers. The first farm has 1 Fujitsu, 3 Dell R820 and 1 Lenovo micro PC (running farmerbot); the second farm has 2 Fujitsu and 1 Lenovo micro PC (also running farmerbot). Both farms have problems with the Fujitsu servers. The main problem is that when farmerbot sends the "power down" command, the server does not shut down. It freezes and does not respond to any input (for example, Alt+F2). The server never shuts down, so I have to restart it manually (reset with the power button).

This issue occurs on both farms. So far I have updated the BIOS and iRMC. The motherboard has options to use WoL (authorized by BIOS) and ACPI (authorized by OS). I have tried both options and nothing changed. I have tested on different LANs and routers (2 locations) and it still occurs.

I use 1 patch cable shared by ZOS and iRMC.

Farm IDs: 5 and 863

config.md is now configured to not turn off nodes.
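
Since WoL is one of the power options mentioned above, a minimal sketch for checking that a powered-off node wakes on a magic packet, independently of farmerbot, might look like the following. It assumes the standard magic packet layout (6 bytes of 0xFF followed by the target MAC address repeated 16 times) sent as a UDP broadcast. The function names, the broadcast address and port 9 are illustrative assumptions, and the code is not taken from farmerbot or ZOS.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a standard Wake-on-LAN magic packet for the given MAC address."""
    # Accept "aa:bb:cc:dd:ee:ff" or "aa-bb-cc-dd-ee-ff".
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_magic_packet()` with the MAC address of the node's NIC from another machine on the same LAN would show whether the server wakes up when farmerbot is not involved.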

Attached photos (10): IMG_20230611_201359, IMG_20230611_202600, IMG_20230611_185345, IMG_20230611_185506, IMG_20230611_191840, IMG_20230611_191858, IMG_20230611_193046, IMG_20230611_193052, IMG_20230611_200119, IMG_20230611_200334

muhamadazmy commented 1 year ago

@xmonader I don't think this is a farmerbot issue: if farmerbot sets the power target to down, the node should shut down. The zos info screen (although not pretty) looks like it was trying to shut down, but I don't get why it is frozen, as @JerzyJasiewicz reports.

I will look into the node logs and try to figure out what is going on.

muhamadazmy commented 1 year ago

I noticed a few issues regarding node 863:

It's weird that the node logs end arbitrarily on the 13th of April.

JerzyJasiewicz commented 1 year ago

The nodes are up all the time. I turned them on with the power button, and additionally I have set them in config.md to never turn off. The Polkadot account holds about 4.8k TFT.

The nodes boot from a USB stick. I deployed farmerbot in May, if I remember correctly, but because of the issues described above I set "never turn off" in farmerbot, and since then they have worked normally.

JerzyJasiewicz commented 1 year ago

Yesterday I updated the BIOS and iRMC, so I tried to set up farmerbot again, but the issue still occurs.

muhamadazmy commented 1 year ago

Ah sorry, I checked the wrong node ID. I will take another look. Seems I need more coffee haha

JerzyJasiewicz commented 1 year ago

Okay :) I have the same issue on farm 5, where there is one Fujitsu.

muhamadazmy commented 1 year ago

Node 896 logs seem to be okay (the node is also online and reachable). It's kind of hard to spot the time when it froze trying to shut down (unless you know the exact day to inspect). I am searching the logs to find out whether it said anything during the shutdown that explains why it froze.

Same for node 4634.

Do you remember any specific dates when you got this problem?

JerzyJasiewicz commented 1 year ago

It was yesterday at about 19:30 GMT+1.

brandonpille commented 1 year ago

I notice you set never_shutdown to true for all nodes. This means these nodes will never be shut down...

JerzyJasiewicz commented 1 year ago

Of course I did, because without "never shutdown" my nodes were freezing due to this issue. So for now farmerbot is working, but there is nothing for it to shut down.

JerzyJasiewicz commented 1 year ago

Hello, any new ideas?

scottyeager commented 1 year ago

These seem to be the same errors seen in https://github.com/threefoldtech/zos/issues/1968

brandonpille commented 1 year ago

@muhamadazmy any progress?

JerzyJasiewicz commented 1 year ago

Nothing has changed for now. This issue affects all the Fujitsu servers that I know of (6 nodes).

muhamadazmy commented 1 year ago

I still can't think of anything that could cause this except a kernel compatibility issue with this hardware. I am not sure whether a BIOS update would help, but you can try that.

xmonader commented 1 year ago

Closing. Please reopen if this is still an issue.