minimaxir / facebook-page-post-scraper

Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis

Unicode Encode Error on Public Group Scraping #74

Open Raketemensch23 opened 7 years ago

Raketemensch23 commented 7 years ago

Hello,

I received a UnicodeEncodeError when running the get_fb_posts_fb_group.py script. I am using Python 3.6.1 :: Anaconda 4.4.0 (64-bit). The .csv file was created, and about 19 posts were scraped before the error. The full CSV file is available here: https://pastebin.com/BjZY8eEn . Do you know which Unicode characters might have caused this issue? I verified that the group is encoded in UTF-8. The character causing the error appears to be:

Unicode Character 'LATIN SMALL LETTER C WITH CARON' (U+010D)

C:\Users\Mike\Documents\Dad\Lynda.com Python\Machine Learning\facebook-page-post-scraper-master>python get_fb_posts_fb_group.py
Scraping 266989076737353 Facebook Group: 2017-07-25 21:48:53.087302

Traceback (most recent call last):
  File "get_fb_posts_fb_group.py", line 199, in <module>
    scrapeFacebookPageFeedStatus(group_id, access_token, since_date, until_date)
  File "get_fb_posts_fb_group.py", line 175, in scrapeFacebookPageFeedStatus
    w.writerow(status_data + reactions_data + (num_special,))
  File "C:\Users\Mike\Anaconda3\envs\py3k\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 34-38: character maps to <undefined>

umutto commented 7 years ago

It seems to be a problem when writing into the .csv file.

Try changing the line

with open('{}_facebook_statuses.csv'.format(group_id), 'w') as file:

to

with open('{}_facebook_statuses.csv'.format(group_id), 'w', encoding='utf-8') as file:
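For anyone who wants to try the change in isolation, here is a minimal sketch, not the script's actual code. The sample row, column names, and temporary file path are made up for illustration. It writes a row containing U+010D and an emoji with an explicit encoding, so the Windows default (cp1252, as seen in the traceback) is never used. The newline='' argument is an extra that the csv module documentation recommends; it is not part of the suggestion above.

```python
import csv
import os
import tempfile

# Hypothetical sample row; the real script builds its rows from the Graph API.
row = ("12345_67890", "Dobr\u00fd ve\u010der \U0001f44d", "status")

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example_facebook_statuses.csv")

# An explicit encoding replaces the platform default (cp1252 in the traceback).
# newline='' lets the csv module control line endings itself.
with open(path, "w", encoding="utf-8", newline="") as file:
    w = csv.writer(file)
    w.writerow(["status_id", "status_message", "status_type"])
    w.writerow(row)

# Read the file back with the same encoding to confirm the text survived.
with open(path, encoding="utf-8", newline="") as file:
    rows = list(csv.reader(file))
```

If the resulting file is opened in Excel and the characters look wrong, encoding='utf-8-sig' (which adds a byte order mark) may help, though that is a separate issue from the error reported here.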

dsynkov commented 7 years ago

For me it was U+1F44D (\U0001f44d), the "thumbs up" sign. @umutto's suggestion seemed to do the trick.
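To find out which characters in a given post would trip the cp1252 codec, a small check along these lines might help. The function name and the sample string are hypothetical; the function simply tries to encode each character on its own and records the ones that fail.

```python
def unencodable_chars(text, encoding="cp1252"):
    """Return (index, char, code point) for each character `encoding` cannot represent."""
    problems = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((i, ch, "U+{:04X}".format(ord(ch))))
    return problems


# Hypothetical sample containing both characters mentioned in this thread.
sample = "Dobr\u00fd ve\u010der \U0001f44d"
found = unencodable_chars(sample)
for index, ch, codepoint in found:
    print(index, codepoint)
```

In this sample, U+00FD is reported as fine because cp1252 covers it, while U+010D and U+1F44D are both flagged.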

nxy commented 6 years ago

@umutto thanks worked for me