StorjOld / downstream-farmer

Client software for a Storj farmer.
http://driveshare.org
MIT License

Farmer exits when node is temporarily unavailable #25

Open neatbasis opened 9 years ago

neatbasis commented 9 years ago

Farmer should be more resilient in the event there are connectivity issues.

This is what happens when the node is temporarily unavailable. (Bad connectivity, solar flares etc.)

Challenge update failed: Unable to perform HTTP get.
Dropping contract 96a1b0e4558caef6579...

Farmer process terminates

neatbasis commented 9 years ago

As suggested at http://stackoverflow.com/questions/15431044/can-i-set-max-retries-for-requests-request setting requests.adapters.DEFAULT_RETRIES = 5 might help. That method might be deprecated in favor of this:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
# Retry failed connections up to 5 times for URLs under this prefix.
s.mount('http://stackoverflow.com', HTTPAdapter(max_retries=5))

EmergentBehavior commented 9 years ago

The relevant code is here: https://github.com/Storj/downstream-farmer/blob/master/downstream_farmer/contract.py#L65-L68

Want to submit a pull request? :)

neatbasis commented 9 years ago

Perhaps once I'm finished with playing with the source and testing. Tried the fix I suggested, but it's not resilient enough yet.

super3 commented 9 years ago

There is currently a --keepalive command line argument. @wiggzz Can you add some insight on what logic that adds?

wiggzz commented 9 years ago

Basically, it catches any exceptions inside the main loop and, if --keepalive is specified, attempts to reconnect after 10 seconds. It won't retry if it fails on the first try, though, since it checks the URL of the node to ensure it has connectivity before entering the main loop. This probably isn't all that robust yet, but it was a quick and dirty way to keep the farmer alive through temporary connectivity issues.
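
For readers who want to picture the logic described above, here is a rough sketch. It is not the actual downstream-farmer code: the function name run_with_keepalive and the check_connectivity and run_main_loop callables are hypothetical stand-ins, and whether the real farmer repeats the connectivity check on reconnect is not stated in this thread.

```python
import time


def run_with_keepalive(check_connectivity, run_main_loop, keepalive=False,
                       retry_delay=10, sleep=time.sleep):
    """Hypothetical sketch of the --keepalive behavior described above.

    check_connectivity and run_main_loop are placeholder callables,
    not functions from downstream-farmer.
    """
    # The node URL is checked once, before the main loop is entered.
    # A failure here propagates immediately, even with keepalive set.
    check_connectivity()

    while True:
        try:
            run_main_loop()
            return  # main loop finished without an error
        except Exception as exc:
            if not keepalive:
                # Without --keepalive the error ends the process.
                raise
            print("Unexpected error: %s" % (exc,))
            print("Reconnecting in %d seconds" % retry_delay)
            sleep(retry_delay)
```

The sleep parameter is only there so the delay can be replaced in tests. Catching Exception rather than BaseException leaves Ctrl+C (KeyboardInterrupt) able to stop the process.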

heunland commented 9 years ago

I think I have a similar issue with connectivity. I wonder if the error message @neatbasis got is the same as mine:

Challenge update failed: Unable to perform HTTP get.
Dropping contract dc45022d23518fbc5ad87dc3dd27328b12abb63f8dbd75be4e8e20e2d3e36de1
Total size: 0, Desired Size: 100
100 bytes remaining
Unexpected error: ('Connection aborted.', error(110, 'Connection timed out'))

neatbasis commented 9 years ago

Yeah. Same thing

EmergentBehavior commented 9 years ago

Did you guys try --keepalive? Also, there are some updates to downstream-farmer coming that will allow it to answer challenges in bulk (as opposed to one by one) and not immediately drop contracts if it crashes.

heunland commented 9 years ago

Yes, last night I started the farmer with --keepalive, so far, so good. When you have the new update for downstream-farmer, do we need to download a new file?

super3 commented 9 years ago

@heunland Working on it. Will update you when we have new binaries for you to download.

neatbasis commented 9 years ago

Have been running the farmer with --keepalive for almost a week now. No hiccups so far. (ds 1.5 --size 30720)