Open: paladini opened this issue 7 years ago
If anyone is having the same issue, I've found how to fix that! Just change the following code from the comments scraper:
    def request_until_succeed(url):
        req = Request(url)
        success = False
        while success is False:
            try:
                response = urlopen(req)
                if response.getcode() == 200:
                    success = True
            except Exception as e:
                print(e)
                time.sleep(5)

                print("Error for URL {}: {}".format(url, datetime.datetime.now()))
                print("Retrying.")

        return response.read()
To this one (I've added `.decode('utf-8')` before returning the value):
    def request_until_succeed(url):
        req = Request(url)
        success = False
        while success is False:
            try:
                response = urlopen(req)
                if response.getcode() == 200:
                    success = True
            except Exception as e:
                print(e)
                time.sleep(5)

                print("Error for URL {}: {}".format(url, datetime.datetime.now()))
                print("Retrying.")

        # Decode the raw bytes so the caller receives text
        return response.read().decode('utf-8')
Now it's working fine here, but I don't know if it's reliable for everyone, so I'm not going to submit a pull request with this fix.
The script does encoding/decoding shenanigans in order to be compatible with both Python 2 and 3. I will have to check if that solution will work for Python 2.
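One way to check the Python 2 question is to decode only when the value is actually a byte string. The snippet below is a minimal sketch, not code from the scraper: `to_text` is a hypothetical helper name, and the sample payload is made up to stand in for a response body containing Brazilian Portuguese text. It relies on `bytes` being an alias of `str` in Python 2, so the same check should behave the same way on both versions.

```python
# -*- coding: utf-8 -*-
import json


def to_text(raw, encoding="utf-8"):
    """Return text whether `raw` is a byte string or already decoded.

    Hypothetical helper for illustration; not part of the scraper.
    """
    if isinstance(raw, bytes):
        # Python 3: bytes -> str. Python 2: str -> unicode.
        return raw.decode(encoding)
    return raw


# Made-up sample standing in for response.read() on a PT-BR page.
sample = u'{"message": "Ol\u00e1, tudo bem?"}'.encode("utf-8")

# Decoding first means json.loads always receives text.
data = json.loads(to_text(sample))
```

If something like this were used, `request_until_succeed` could return `to_text(response.read())` instead of calling `.decode('utf-8')` directly, so already-decoded input would pass through unchanged.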
Thanks for the fast reply, @minimaxir !
Guys, I have an issue with paging again and cannot figure out why it is happening. Can you help me? Thanks! `--------------------------------------------------------------------------- AttributeError Traceback (most recent call last)`
@paladini thanks, that worked for me.
I get this issue when using the comment scraper for public pages. I've filled in all the variables correctly (app_id, app_secret and page id), and I ran the post scraper beforehand, which finished successfully.
Here is the full error log:
The page I'm scraping has posts and comments written in Brazilian Portuguese (PT-BR).