Hey, my mate said you'd created a script based on mine, which is kinda awesome :) So I thought I'd download and have a play, but the script was broken for two reasons: requests has changed its API a bit, and also when you were string formatting log messages, you were taking unicode strings and trying to put them into ascii (byte) strings, like so:
new_string = 'This is an ascii string %s' % this_is_a_unicode_string
That causes an error, so when you do formatting, always make sure to prepend u to the format string, like so:
new_string = u'This is now a unicode string %s' % this_is_a_unicode_string
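To make the before/after concrete, here's a minimal sketch of the fix (the variable name and value are hypothetical; this assumes a Python 2 script, as the u'' prefix suggests, where byte strings and unicode strings are distinct types):

```python
# Hypothetical scraped value containing a non-ascii character
page_title = u'caf\xe9'

# Broken under Python 2: an ascii byte-string template mixed with unicode
# can blow up with a UnicodeDecodeError/UnicodeEncodeError once non-ascii
# characters show up, e.g. when the log message hits an ascii stream:
# msg = 'Downloading %s' % page_title

# Fixed: use a unicode template throughout, so no implicit ascii
# coercion ever happens
msg = u'Downloading %s' % page_title
```

(Under Python 3 this distinction goes away, since all string literals are unicode.)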
So I fixed both of those things, and if you don't mind I'll be having a look at porting some of the features (better scraping, mainly) into my original script. If you haven't checked it out in a while, it's vastly improved (I rewrote it with multiprocessing, then Twisted, and now I'm using gevent) and more stable, and doesn't choke one's internet connection as much (pages are fetched sequentially, but downloads are done concurrently).
Have a nice day!