minimaxir / facebook-page-post-scraper

Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis

No 'from' key returned for comments #88

Open SCooled opened 6 years ago

SCooled commented 6 years ago

Not sure whether this is an issue with the script or whether it just requires specific permissions. I initially hit this error when running get_fb_comments_from_fb.py:

KeyError: 'from'

Pasting the full URL into a browser shows the same thing: all other fields/keys return data, but I cannot seem to get data for 'from'.

The FB documentation still includes 'from' as a field/endpoint, so I was wondering whether a) a change or update to the Graph API means this field is no longer supported (has anybody else seen this?), b) my permissions do not allow access (although I've not read anything about this requirement), c) my use of the script is wrong, or d) something else.

I thought it could be along the same lines as https://github.com/minimaxir/facebook-page-post-scraper/pull/71 but it doesn't seem to be.

Note: I can get the scraper to work when I hash out (comment out) the author/'from' parts, so all other parts of the script work fine.

Any help would be greatly appreciated - Thanks

MiguelTBotelho commented 6 years ago

Same here. Have you already solved this issue?

codenamenull commented 6 years ago

I got that problem too. Has anyone solved this?

haseebmahmud commented 6 years ago

I am having the same issue:

$ python3 get_fb_comments_from_fb.py
Scraping asdfg Comments From Posts: 2018-02-08 11:02:58.329331

Traceback (most recent call last):
  File "get_fb_comments_from_fb.py", line 233, in <module>
    scrapeFacebookPageFeedComments(file_id, access_token)
  File "get_fb_comments_from_fb.py", line 161, in scrapeFacebookPageFeedComments
    comment, status['status_id'])
  File "get_fb_comments_from_fb.py", line 94, in processFacebookComment
    comment_author = unicode_decode(comment['from']['name'])
KeyError: 'from'

leduyloc commented 6 years ago

Me too. Could anyone teach us how to solve this problem?

marymary22 commented 6 years ago

I removed the ['from'] and it worked for me.

jimenezfer commented 6 years ago

I am having the same issue; removing ['from'] doesn't work either.

rlorenz123 commented 6 years ago

Has anyone figured this out? Is this due to API changes and privacy issues? I tried to remove the 'from' but then get the following error message:


IndexError Traceback (most recent call last)

in ()
    227
    228 if __name__ == '__main__':
--> 229     scrapeFacebookPageFeedComments(file_id, access_token)
    230
    231

in scrapeFacebookPageFeedComments(page_id, access_token)
    159
    160         # calculate thankful/pride through algebra
--> 161         num_special = comment_data[6] - sum(reactions_data)
    162         w.writerow(comment_data + reactions_data +
    163                    (num_special, ))

IndexError: tuple index out of range

AbhishekBabuji commented 6 years ago

Any fixes yet? I'm running into the same issue!

lvsanalytics commented 6 years ago

The error means that it is failing to pull the ['from']['name'] field out of the dictionary. This is failing because that information isn't being returned when you make the request. Why?

Most likely because you are using an application or user access token and not a page access token. Those fields are only returned when you use the page access token.

I switched over to a page access token for a page I am an administrator on and it is working fine for me. Now I need to get a permanent access token to run against. You can find more on access tokens here.

https://developers.facebook.com/docs/facebook-login/access-tokens#pagetokens

The easiest solution is to get a permanent token for your page and create a version of get_fb_comments_from_fb.py for each page you want to pull comments from, using that page's access token (probably do the same for get_fb_posts_fb_page.py too).

If you want comments for a page you don't own, a temporary solution that worked for me was to go to line 91 and switch comment['from']['name'] to a placeholder value, something like 'unknown'. That is, change

comment_author = unicode_decode(comment['from']['name'])

to

comment_author = unicode_decode('test')#(comment['from']['name'])

You won't get the author information but you will get everything else. I need to look more into it and see if there is another way to get the comment author info.
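
A slightly more general version of that workaround, as a minimal sketch only: instead of hard-coding a placeholder, fall back to one only when the API response has no 'from' key. The helper name get_comment_author and the two sample dictionaries are hypothetical and not part of the script; they only mimic the shape of a comment with and without author data.

```python
def get_comment_author(comment, default="unknown"):
    """Return the author's name, or `default` when 'from' is missing.

    Hypothetical helper, not part of get_fb_comments_from_fb.py.
    """
    # 'from' may be absent (or empty) depending on the access token used.
    author = comment.get("from") or {}
    return author.get("name", default)


# Hypothetical payloads shaped like comment objects in the API response.
with_author = {"id": "1_1", "message": "hi", "from": {"name": "Alice", "id": "42"}}
without_author = {"id": "1_2", "message": "hello"}

print(get_comment_author(with_author))
print(get_comment_author(without_author))
```

In the script, the failing line could then read comment_author = unicode_decode(get_comment_author(comment)), assuming the rest of processFacebookComment stays unchanged.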

HannaBjorkman commented 5 years ago

Hi everyone! I'm getting the same error here, and I have the page access token (which has been converted to a never-expiring access token) that you mention, @jcommaroto. Have you had a chance to look into this further? There was no issue running the post scraping script, so the access token shouldn't be the problem?