martomi / chiadog

A watchdog providing peace of mind that your Chia farm is running smoothly 24/7.
MIT License

Windows: False "Harvester appears offline" notifications when farm is fine #72

Closed suckatlife closed 3 years ago

suckatlife commented 3 years ago

Describe the bug

I'm running the Windows version of chiadog. Every now and then I'll get the "Your harvester appears offline! No events for the past xxxx seconds" notification, and I will continue getting this notification until I restart chiadog. When I check my farm (using chia farm summary) everything is fine. I actually won a block during one of these fake outages.

Looking at timestamps, these outages seem to always coincide with a log rotation, but not every rotation. My log rotates every 40-50 minutes, but this only happens around once a day or so - so it's definitely not with every rotation.

It was suggested I turn off any other processes tailing (using get-content) debug.log, which I've done, but it didn't help.

Environment:

centrd commented 3 years ago

This is a known issue. There is a fix in the works. It is indeed related to log rotation and the way logs are being accessed.

pieterhelsen commented 3 years ago

I wonder if we can write a Windows-specific monitor that hooks into the keep alive monitor thread and then tries to reset the _consume_loop in the log_consumer if there's no activity for 60 seconds...

Or might be even better to write a monitor that monitors the size of the debug.log (on Windows) and resets the _consume_loop if the size goes from X MB to < 1 MB.

What do you think @martomi ?
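The size-based check suggested above could be sketched roughly as follows. This is only an illustration using the standard library; the class name `SizeRotationDetector` and its `rotated()` method are hypothetical and are not part of chiadog.

```python
import os


class SizeRotationDetector:
    """Flags a probable log rotation when the file shrinks between checks.

    Hypothetical helper for illustration; not part of chiadog.
    """

    def __init__(self, path):
        self._path = path
        self._last_size = self._current_size()

    def _current_size(self):
        try:
            return os.path.getsize(self._path)
        except OSError:
            # The file can be briefly missing while it is being rotated.
            return 0

    def rotated(self):
        """Return True if the file is smaller than at the previous check."""
        size = self._current_size()
        shrunk = size < self._last_size
        self._last_size = size
        return shrunk
```

A keep-alive thread could poll `rotated()` every few seconds and re-open the log when it returns `True`. How that would hook into `_consume_loop` is left open here.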

pieterhelsen commented 3 years ago

Apologies @skrustev ; I should've updated the issue. I've been working on a fix as well, which has been tested successfully by a few people on the Keybase channel. https://github.com/martomi/chiadog/tree/windows-rotating-log (initial version; needs cleaning up)

However, your solution ( #87 ) would be preferable I think, as it's not so invasive. Thoughts, @martomi ?

greimela commented 3 years ago

If #87 fixes the issue and has no side effects it would be really nice!

skrustev commented 3 years ago

Yeah, it does not fix the issue, which is why I closed it. It doesn't happen as often, but I still got the false offline state. You can still try it out in the meantime, but I don't think it's enough.

pieterhelsen commented 3 years ago

OK! Thanks anyway, it would've been a graceful solution :) Feel free to try out my branch instead. I've only implemented it for local File Consumers right now, but will implement the fix for the NetworkLogConsumer tonight.

pieterhelsen commented 3 years ago

I have moved the work on this feature to a new branch: https://github.com/martomi/chiadog/tree/windows-rotation

This ticket relates to #102

tschechniker commented 3 years ago

@pieterhelsen I see the same issue on Linux too. It works fine for a while, then harvester offline messages pop up. Every time I check the logs, everything is totally fine. From what I can see, it's related.

DrHou83 commented 3 years ago

Hey guys, are we waiting on a fix here? Does it help if I switch the log level back from INFO to stop the log rotation?

24601 commented 3 years ago

I can confirm this also occurs on macOS (Big Sur).

gilgm12 commented 3 years ago

This fixed up yet? I'm resetting chiadog daily.

pieterhelsen commented 3 years ago

We have a fix ready that uses the Pygtail module to provide a more pythonic way of reading the logfiles and has better handling of log rotations.

martomi commented 3 years ago

Closing this now as the fix is in the new release. Please reopen if issue persists.

runechronos commented 3 years ago

Are you sure? I have exactly the same error after the update: it keeps telling me the harvester is offline. Here is my daily recap from yesterday, for example. Does it seem normal to you?

โ„น๏ธ Chia DAILY: Hello farmer! ๐Ÿ‘‹ Here's what happened in the last 24 hours:

Received โ˜˜๏ธ: 0.00 XCH Proofs ๐Ÿงพ: None Search ๐Ÿ”:

pieterhelsen commented 3 years ago

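Daily stats look OK. Some additional questions for you:

  • How often do you get the message telling you your harvester is offline?
  • Are you monitoring a local harvester or a remote harvester (using SSH)?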

If the problem still persists, please change this line temporarily https://github.com/martomi/chiadog/blob/429b90cf2d3f74a885d4ffcb748fb63b1521e48b/src/chia_log/log_consumer.py#L75

from

for log_line in Pygtail(self._expanded_log_path, read_from_end=True, offset_file=self._offset_path):

to this

for log_line in Pygtail(self._expanded_log_path, read_from_end=True, offset_file=self._offset_path, paranoid=True):

Please report back if this fixes your problem.
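For context, Pygtail's documentation describes `paranoid` as updating the offset file after every line read, rather than only when the end of the file is reached. The sketch below mimics that idea with the standard library only. It is not Pygtail's implementation, and the names `read_new_lines`, `save_every_line`, and `_save_offset` are made up for illustration.

```python
import os


def read_new_lines(log_path, offset_path, save_every_line=False):
    """Yield lines added to log_path since the offset stored in offset_path.

    With save_every_line=True the offset is written after each line, so an
    interrupted reader does not re-read lines it has already been given.
    """
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)

    # A file smaller than the saved offset was probably rotated: start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        while True:
            line = log.readline()
            if not line:
                break
            offset = log.tell()
            if save_every_line:
                _save_offset(offset_path, offset)
            yield line.decode("utf-8", errors="replace")
    # Only reached when the consumer reads all the way to the end.
    _save_offset(offset_path, offset)


def _save_offset(offset_path, offset):
    with open(offset_path, "w") as f:
        f.write(str(offset))
```

If the consumer stops partway through, the default mode leaves the stored offset untouched, while the per-line mode has already recorded how far it got. The cost is one extra small write per log line.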

gilgm12 commented 3 years ago

After a pull the other day, things seem much better for me - no false 'offline' since.

On Fri, May 28, 2021 at 1:55 AM pieterhelsen @.***> wrote:

Daily stats look OK. Some additional questions for you:

  • How often do you get the message telling you your harvester is offline?
  • Are you monitoring a local harvester or a remote harvester (using SSH)?

If the problem still persists, please change this line temporarily https://github.com/martomi/chiadog/blob/429b90cf2d3f74a885d4ffcb748fb63b1521e48b/src/chia_log/log_consumer.py#L75

from

        for log_line in Pygtail(self._expanded_log_path, read_from_end=True, offset_file=self._offset_path):

to this

        for log_line in Pygtail(self._expanded_log_path, read_from_end=True, offset_file=self._offset_path, paranoid=True):

Please report back if this fixes your problem.


runechronos commented 3 years ago

@pieterhelsen I'm monitoring a local harvester. I create my plots on computer 1, send them to computer 2 over my local network, and I'm farming and running the full node on computer 2. I get the error message every 5 minutes, so it's a huge amount of spam in the Pushover app ^^

What will adding the paranoid=True parameter change? (Maybe it only removes the harvester notifications?)

gilgm12 commented 3 years ago

@runechronos - that was exactly how it was for me until I did git pull in master. I think this has been fixed.

-M


runechronos commented 3 years ago

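@gilgm12 I think you are right. I did a new pull, and it seems to be OK now :)

Can someone explain to me how the different "search" tries work?

Search 🔍:

  • average: 0.46s in 9500 tries
  • over 5s: 0
  • over 15s: 0

(for example) What has to be achieved to get the block reward? (I guess one of your plots has to pass all 3, and maybe more?)

Or if you have a link explaining how Chia harvesting works exactly, I would like to be less dumb ^^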

Thanks in advance.

gilgm12 commented 3 years ago

Generally you want the scan times to be as close to zero as possible. Mine look like this:

Search ๐Ÿ”:

The majority of my scans are fast, but sometimes when plots are transferring to almost-full disks, the search time is slower.

Chia rewards are sort of like a bingo ticket, and the search times are how fast you yell bingo. Under 5s is recommended; over 30s and you lose that opportunity to win.

This is what I've gathered at least - there might be subtleties I'm omitting. I'd also be interested in knowing if I 'missed' a win - this is out of scope for this particular ticket though. Glad it's working for you!
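To make the Search figures in the daily summary concrete, here is a minimal sketch of how an average and the "over 5s" / "over 15s" counts could be derived from a list of lookup times. The function name `summarize_search_times` is hypothetical. Counting a slow search in both buckets is an assumption, not necessarily how chiadog computes it.

```python
def summarize_search_times(times_s):
    """Return the average and slow-search counts for lookup times in seconds."""
    if not times_s:
        return {"tries": 0, "average": 0.0, "over_5s": 0, "over_15s": 0}
    return {
        "tries": len(times_s),
        "average": sum(times_s) / len(times_s),
        # Assumption: a 20 s search counts in both the 5 s and 15 s buckets.
        "over_5s": sum(1 for t in times_s if t > 5),
        "over_15s": sum(1 for t in times_s if t > 15),
    }
```

For example, `summarize_search_times([0.2, 0.4, 6.0, 20.0])` reports 4 tries, two of them over 5s and one over 15s.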

On Thu, Jun 3, 2021, 3:21 AM runechronos @.***> wrote:

@gilgm12 https://github.com/gilgm12 I think you are right, i did a new pull, and it seems to be ok now :)

Just can someone explain to me how the different "search tries are working ?

Search ๐Ÿ”:

  • average: 0.46s in 9500 tries
  • over 5s: 0
  • over 15s: 0 (for example) What has to be achieved to get the block reward (i guess one of your plot has to pass all 3 and maybe more ?)

Or if you have some link explaining how Chia harvesting is working exactly, would like to be less dumb ^^

Thanks in advance.
