martomi / chiadog

A watchdog providing peace of mind that your Chia farm is running smoothly 24/7.
MIT License

"tail -F" sometimes fails to follow the new log file #134

Closed bhy closed 3 years ago

bhy commented 3 years ago

On Linux it looks like tail -F may not reliably follow new log files after rotation. When this happens, chiadog reports "Your harvester appears to be offline", and inspecting the /proc file system shows that the tail process still has the old log file open, e.g. .chia/mainnet/log/debug.log.11.

Environment:

pieterhelsen commented 3 years ago

Thanks for this insight! It looks like that's definitive proof that we need to move away from using tail and Windows' Get-Content.

This seems like it could provide a cross-platform solution for LOCAL harvesters https://stackoverflow.com/a/43547769

For remote harvesting, however, this will not work. One possible alternative is Paramiko's SFTPClient.open method, but this is slow (as per this discussion) and probably won't handle log rotations well.
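
For the local case, one way to avoid tail entirely is to poll the log path from Python and start over whenever the file behind that path changes. The following is a minimal sketch, not chiadog's implementation and not necessarily the approach of the linked answer. The class name `PollingLogFollower` and its `poll` method are hypothetical. It assumes that a rotation shows up either as a new device/inode pair or as a file shorter than the last read offset.

```python
import os
from typing import List, Optional, Tuple


class PollingLogFollower:
    """Hypothetical helper: reads newly appended lines from a log path and
    starts over when the file behind the path is replaced or truncated."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._file_id: Optional[Tuple[int, int]] = None  # (st_dev, st_ino)
        self._offset = 0      # byte offset of the next unread data
        self._partial = b""   # trailing bytes of a line not yet terminated

    def poll(self) -> List[str]:
        """Return the complete lines written since the previous call."""
        try:
            # Open by name on every poll so a rotated file is picked up
            # and no handle to the old file is kept between polls.
            with open(self.path, "rb") as handle:
                st = os.fstat(handle.fileno())
                file_id = (st.st_dev, st.st_ino)
                if file_id != self._file_id or st.st_size < self._offset:
                    # Different file or truncated file: read from the top.
                    self._file_id = file_id
                    self._offset = 0
                    self._partial = b""
                handle.seek(self._offset)
                data = handle.read()
                self._offset = handle.tell()
        except FileNotFoundError:
            # The path can be briefly missing in the middle of a rotation.
            return []
        # Keep an unterminated last line buffered until its newline arrives.
        *complete, self._partial = (self._partial + data).split(b"\n")
        return [
            line.rstrip(b"\r").decode("utf-8", errors="replace")
            for line in complete
        ]
```

Calling `poll()` in a loop with a short sleep gives tail-like behaviour. Two trade-offs of this sketch: the first call returns the whole existing file rather than only new lines, and lines appended to the old file between the last poll and the rotation are not read.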

bhy commented 3 years ago

Meanwhile I think the following tail option might help:

       --max-unchanged-stats=N
              with --follow=name, reopen a FILE which has not changed size
              after N (default 5) iterations to see if it has been unlinked
              or renamed (this is the usual case of rotated log files); with
              inotify, this option is rarely useful

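
As a rough illustration of how that option could be passed when tail is launched from Python, here is a sketch that only builds the argument list. The function name `build_tail_command` and its default value are hypothetical and are not taken from chiadog. It uses the long GNU coreutils spellings (`tail -F` is equivalent to `--follow=name --retry`), so it assumes GNU tail and may not work with other tail implementations.

```python
from typing import List


def build_tail_command(log_path: str, max_unchanged_stats: int = 1) -> List[str]:
    """Hypothetical helper: argument list for GNU tail that follows a file
    by name and rechecks the name after fewer unchanged iterations."""
    if max_unchanged_stats < 1:
        raise ValueError("max_unchanged_stats must be at least 1")
    return [
        "tail",
        "--follow=name",  # follow the path, not the open file descriptor
        "--retry",        # keep trying if the path is temporarily missing
        f"--max-unchanged-stats={max_unchanged_stats}",
        log_path,
    ]
```

The resulting list can be handed to `subprocess.Popen(..., stdout=subprocess.PIPE)`. As the manual excerpt notes, the option is rarely useful when tail is using inotify, so this may not change the behaviour on every system.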
pieterhelsen commented 3 years ago

It would likely help, but it's still not very pythonic and does not work across platforms. I plan to spend some time on fixing this in the next couple of days.

Gr33nDrag0n69 commented 3 years ago

This bug is what prevents me from using this software for production monitoring. Overall, very good job. Concerning the issue, I have a couple of ideas on how it could at least get a quick and dirty fix before better code is written for the long run. The simplest dirty fix for now could be:

When 'Your harvester appears to be offline! No events for the past X seconds.' is detected, the code could first try to reload the log consumer before sending the warning.
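
The idea above can be sketched as follows. This is only an illustration under assumptions: `StalenessGuard`, `restart_consumer` and `send_warning` are hypothetical names and do not correspond to chiadog's actual classes or callbacks. On the first stale check the guard restarts the log consumer and waits one more threshold period; the warning is sent only if no event arrives after that.

```python
import time
from typing import Callable, Optional


class StalenessGuard:
    """Hypothetical sketch: restart the log consumer once before warning."""

    def __init__(
        self,
        threshold_seconds: float,
        restart_consumer: Callable[[], None],
        send_warning: Callable[[str], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = threshold_seconds
        self._restart_consumer = restart_consumer
        self._send_warning = send_warning
        self._clock = clock
        self._last_event = clock()
        self._restarted_at: Optional[float] = None

    def on_event(self) -> None:
        """Call whenever the consumer delivers a harvester event."""
        self._last_event = self._clock()
        self._restarted_at = None

    def check(self) -> str:
        """Run periodically; returns what was done, which eases testing."""
        now = self._clock()
        idle = now - self._last_event
        if idle < self._threshold:
            return "ok"
        if self._restarted_at is None:
            # First sign of silence: suspect the consumer, not the harvester.
            self._restart_consumer()
            self._restarted_at = now
            return "restarted"
        if now - self._restarted_at < self._threshold:
            # Give the restarted consumer a full threshold period.
            return "waiting"
        self._send_warning(
            "Your harvester appears to be offline! "
            f"No events for the past {int(idle)} seconds."
        )
        return "warned"
```

With a 300 second threshold, a stuck consumer would be restarted at 300 seconds of silence and a warning would follow only if the silence continues for another 300 seconds.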

Gr33nDrag0n69 commented 3 years ago

Also, on Windows, the current approach never works at all. (As soon as the log rotation code from the chia logging module is executed, the harvester offline warning appears and the only fix is to kill and restart chiadog.)

ghost commented 3 years ago

I can second this. I'm using chiadog in WSL to monitor my Windows plotter and farmer. My log files rotate roughly every 22 minutes and I am getting spammed with "Your harvester appears to be offline" messages.

pieterhelsen commented 3 years ago

There's a branch currently in Pull Request that will likely solve this issue.

#164 closes this issue

ghost commented 3 years ago

Thank you, the pull request should definitely solve this. Looking forward to the merge.

Gr33nDrag0n69 commented 3 years ago

Thank you! I see the code was merged into the dev branch. When can we expect an official release with this upgrade implemented?

martomi commented 3 years ago

The official release is out. This is hopefully resolved; please reopen the issue if not.