modrzew / pokeminer

Pokemon location scraper
MIT License

ADVICE: My approach to handling worker bans (Ubuntu Server) #266

Open crhbetz opened 7 years ago

crhbetz commented 7 years ago

Hey guys,

Since we don't really have a community space anywhere (or do we?), I'll share my approach to handling worker account bans on my Ubuntu 14.04 server right here. I'll probably close the issue in a few days to avoid clutter and just keep it around for reference. Also, tell me if there's a better place to post general advice.
Anyway, here we go.

Step 1: Run worker.py in a way that restarts it whenever it gets killed.
Example: while true; do python worker.py --log-level=WARNING; done
I'm using: while true; do timeout 1h python worker.py --log-level=WARNING; done
Here, timeout 1h kills the process after one hour, and the while loop then restarts it. I do this because worker.py tends to eat up more and more RAM over longer run times, and RAM is limited on my rented virtual server.
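
For anyone who wants the Step 1 loop as a reusable function, here is a minimal sketch. The names run_forever, RESTART_PAUSE and MAX_RUNS are made up for this example and are not part of pokeminer; the short pause between restarts is my own addition so that a worker that crashes instantly does not spin the CPU.

```shell
#!/usr/bin/env bash
# Sketch of the Step 1 restart loop wrapped in a function.
# run_forever, RESTART_PAUSE and MAX_RUNS are hypothetical names.

RESTART_PAUSE="${RESTART_PAUSE:-5}"  # seconds to wait between restarts (assumed value)
MAX_RUNS="${MAX_RUNS:-0}"            # 0 = loop forever; a positive value helps when testing

run_forever() {
    # "$@" is the command to supervise, for example:
    #   run_forever timeout 1h python worker.py --log-level=WARNING
    runs=0
    while true; do
        "$@"
        status=$?
        # GNU timeout exits with 124 when it had to stop the command.
        if [ "$status" -eq 124 ]; then
            echo "worker hit the time limit, restarting" >&2
        else
            echo "worker exited with status $status, restarting" >&2
        fi
        runs=$((runs + 1))
        if [ "$MAX_RUNS" -gt 0 ] && [ "$runs" -ge "$MAX_RUNS" ]; then
            return 0
        fi
        sleep "$RESTART_PAUSE"
    done
}
```

Source the file and call run_forever with the same command you would otherwise put inside the while loop.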

Step 2: Have a lot more worker accounts in your config.py than you need right now, as a buffer.

Step 3: Run the following bash script from pastebin (because I'm too stupid for markdown code) in your pokeminer directory (no warranty):

http://pastebin.com/cyVjwrGc

This will identify ban messages in worker.log, delete the banned worker's account from config.py (and store it in banned.txt, for statistics or whatever), and then kill your worker.py process, which immediately restarts via the while loop from Step 1 and picks up the reduced config.py.
The script just runs and waits for something to do, with no status messages. The only thing it prints to the terminal (stdout) is the ban messages it finds.
Again, I take no responsibility if any part of this harms your files or systems. I provide it as an idea that serves me well. Because it kills worker.py on every ban, it might negatively affect your data when multiple accounts get banned within a short timeframe.
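
In case the pastebin link goes away, the idea described above could be sketched roughly like this. This is not the pastebin script, just an illustration of the same steps. Two things are pure assumptions here: the ban line format in BAN_REGEX (check your own worker.log for the real wording) and a config.py that lists one quoted account tuple per line. The function names are made up as well.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the pastebin script, just the same idea.
# Assumptions (hypothetical): the log wording in BAN_REGEX, and a config.py
# with one account per line, e.g.  ('user1', 'password1', 'ptc'),

CONFIG="${CONFIG:-config.py}"
BANNED="${BANNED:-banned.txt}"
LOG="${LOG:-worker.log}"
# Hypothetical log format: "... Account <username> is banned ..."
BAN_REGEX='s/.*Account \([^ ]*\) is banned.*/\1/p'

# Print the username found in a log line, or nothing if it is not a ban line.
banned_user() {
    printf '%s\n' "$1" | sed -n "$BAN_REGEX"
}

# Copy every config line mentioning the quoted username to banned.txt,
# then rewrite config.py without those lines.
remove_account() {
    user="$1"
    tmp="$(mktemp)" || return 1
    grep -F -- "'$user'" "$CONFIG" >> "$BANNED"
    grep -v -F -- "'$user'" "$CONFIG" > "$tmp"
    cat "$tmp" > "$CONFIG"   # overwrite in place to keep file permissions
    rm -f "$tmp"
}

# Follow the log and react to each new ban line. Not called automatically.
watch_bans() {
    tail -n 0 -F "$LOG" | while IFS= read -r line; do
        user="$(banned_user "$line")"
        [ -n "$user" ] || continue
        echo "$line"             # print the ban message to stdout
        remove_account "$user"
        pkill -f 'worker\.py'    # the Step 1 loop then restarts the worker
    done
}
```

To try it, source the file in your pokeminer directory and call watch_bans in a second terminal while the Step 1 loop is running. Back up config.py first.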

and7ey commented 7 years ago

Many thanks for sharing it.

Recently I started getting too many 'ServerSideRequestThrottlingException: Request throttled by server... slow down man' exceptions (with SCAN_DELAY = 15). Is there any way to handle such exceptions as well?

and7ey commented 7 years ago

Here is the OSX approach:

brew install coreutils
while true; do gtimeout 1h python worker.py --log-level=ERROR; done
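
If the same script should run on both Ubuntu and OSX, a small helper can pick whichever command is installed. This is only a sketch; pick_timeout is a made-up name, and it assumes coreutils from Homebrew provides gtimeout as in the commands above.

```shell
#!/usr/bin/env bash
# pick_timeout is a hypothetical helper: it prints the name of the first
# timeout command found on PATH, or fails if neither is installed.
pick_timeout() {
    if command -v timeout >/dev/null 2>&1; then
        echo timeout
    elif command -v gtimeout >/dev/null 2>&1; then
        echo gtimeout
    else
        echo "no timeout command found (on OSX: brew install coreutils)" >&2
        return 1
    fi
}

# Usage (left as a comment so sourcing this file does not start the loop):
#   TIMEOUT="$(pick_timeout)" || exit 1
#   while true; do "$TIMEOUT" 1h python worker.py --log-level=ERROR; done
```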