wevote / WeVoteServer

We Vote's API application server written in Django/Python. Election data pulled from many sources, used by https://github.com/wevote/WebApp and https://github.com/wevote/WeVoteCordova and https://github.com/wevote/Campaigns.
https://api.wevoteusa.org
MIT License

Refresh Facebook Photo Doesn't retrieve. #869

Closed NeiloDull closed 6 years ago

NeiloDull commented 6 years ago

The "Refresh Facebook Photo" tool on the candidate page does not retrieve the Facebook photo, even though a photo exists on the candidate's Facebook page. The message shown is "Facebook photo NOT retrieved (2)."

Example 1: https://api.wevoteusa.org/c/wv02cand37635/edit/?google_civic_election_id=4485&state_code=&hide_candidate_tools=False&page=0

Example 2: https://api.wevoteusa.org/c/wv02cand37000/edit/?google_civic_election_id=1000051&state_code=&hide_candidate_tools=False&page=0

SailingSteve commented 6 years ago

Yup, the screen scrape did not last long.
1) Facebook now requires you to log in to see https://www.facebook.com/voterudbrowne/
2) The first page returned when fetching the Facebook page, which is a React app, no longer contains the profile photo.

Time for plan c.

SailingSteve commented 6 years ago

Dale said, "I think requiring the political data manager using the Facebook tools to sign in with Facebook sounds like the most elegant solution."

So then we can use the Graph API to get the image.

SailingSteve commented 6 years ago

The solution I used was to take advantage of an old type of Graph API call, making that call directly as an HTTP request. This only works for candidates who have an approved Facebook alias like "AdamSchiffCA" or "audreyforcongress"; it does not work for unaliased individual user names like "steve.podell.39".
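For illustration, here is a minimal sketch of how such a call could be built from a candidate's Facebook URL. The thread does not give the exact endpoint, so the `graph.facebook.com/<alias>/picture` URL shape is an assumption, and the helper names `extract_facebook_alias` and `build_graph_picture_url` are hypothetical, not functions from WeVoteServer.

```python
from typing import Optional
from urllib.parse import quote, urlparse

# Assumed shape of the older, unauthenticated picture call; not confirmed by the thread.
GRAPH_PICTURE_TEMPLATE = "https://graph.facebook.com/{alias}/picture?type=large"


def extract_facebook_alias(facebook_url: str) -> Optional[str]:
    """Return the first path segment of a facebook.com URL, or None if there is none."""
    candidate = facebook_url.strip()
    if "://" not in candidate:
        # Accept values stored without a scheme, e.g. "facebook.com/AdamSchiffCA".
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    if host != "facebook.com" and not host.endswith(".facebook.com"):
        return None
    segments = [segment for segment in parsed.path.split("/") if segment]
    return segments[0] if segments else None


def build_graph_picture_url(facebook_url: str) -> Optional[str]:
    """Build the (assumed) picture URL for the alias found in facebook_url."""
    alias = extract_facebook_alias(facebook_url)
    if alias is None:
        return None
    return GRAPH_PICTURE_TEMPLATE.format(alias=quote(alias, safe=""))
```

The sketch only builds the URL. Whether Facebook actually returns a picture still depends on the alias restriction described above.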

DaleMcGrew commented 6 years ago

Hi Steve, Neil is testing this issue and has come across this crash. Could you please take a look?

[Screenshot: bulk-facebook-retrieve-crash]

SailingSteve commented 6 years ago

That code does not exist anymore.

DaleMcGrew commented 6 years ago

The link he is clicking, "Refresh Facebook Photo", is on this page: https://api.wevoteusa.org/c/49705/edit/ Could you please update that link to use the newer code?

SailingSteve commented 6 years ago

@DaleMcGrew The exception location is scrape_facebook_photo_url_from_web_page

[Screenshot, 2018-09-11 2:50:55 pm]

Will you open PyCharm and find that function/method?

DaleMcGrew commented 6 years ago

I'm fixing the minor bugs I'm finding when I test on the live server. Thank you for taking a look.

DaleMcGrew commented 6 years ago

Hi @SailingSteve, I fixed the crashing bugs, but I'm not able to retrieve Facebook photos for the two example candidates Neil gave (below). The error messages shown on the screen don't provide guidance for the Political Data Team. Are the photos not possible to retrieve? Is it that I'm not signed into the API server using Facebook? Any clarification would help the Political Data Team.

"Refresh Facebook Photo" tool in candidate page does not retrieve Facebook Photo despite photo existing on candidate facebook page.

Example 1: https://api.wevoteusa.org/c/wv02cand37635/edit/?google_civic_election_id=4485&state_code=&hide_candidate_tools=False&page=0

Example 2: https://api.wevoteusa.org/c/wv02cand37000/edit/?google_civic_election_id=1000051&state_code=&hide_candidate_tools=False&page=0

SailingSteve commented 6 years ago

[Screenshot, 2018-09-12 12:15:45 pm]

In Example 2, Kevin Coleman's "Facebook URL" and "Candidate Website" are reversed, and that is why there is a failure to load.
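A reversed pair of fields like this could be caught before any fetch is attempted. The following is a minimal sketch, assuming the two values are available as plain strings. The function names and the status strings (`OK`, `FIELDS_LOOK_SWAPPED`, `FACEBOOK_URL_INVALID`) are hypothetical and are not taken from WeVoteServer.

```python
from urllib.parse import urlparse


def is_facebook_url(url: str) -> bool:
    """Return True when the host of url is facebook.com or one of its subdomains."""
    candidate = (url or "").strip()
    if not candidate:
        return False
    if "://" not in candidate:
        # Accept values stored without a scheme.
        candidate = "https://" + candidate
    host = (urlparse(candidate).hostname or "").lower()
    return host == "facebook.com" or host.endswith(".facebook.com")


def check_candidate_urls(facebook_url: str, website_url: str) -> str:
    """Classify the two fields so a clearer message can be shown to the data team."""
    if is_facebook_url(facebook_url):
        return "OK"
    if is_facebook_url(website_url):
        # The Facebook link was entered in the website field.
        return "FIELDS_LOOK_SWAPPED"
    return "FACEBOOK_URL_INVALID"
```

A check of this kind would let the edit page report the swapped fields directly instead of a generic "not retrieved" status.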

SailingSteve commented 6 years ago

Over time, Facebook has become more concerned with fake contributors and organizations mass-extracting relationship information (can't imagine why).

I could do more research to confirm this, but what I have found is that personal pages like "https://www.facebook.com/steve.podell.3" can, depending on your sharing preferences, only be fully seen by your existing friends, and that the Facebook API will only show information about WeVote's Facebook friends/followers (presumably to stop Russia from extracting all friend relationships on Facebook).

There is an older API (that I use) that does not require authentication and returns pictures, but it only seems to work with officially aliased Facebook login names like "AdamSchiffCA" or "audreyforcongress".


1) In your browser, log out of Facebook.
2) Navigate to https://www.facebook.com/KevinColemanforStateRep/
3) Note that you can view this aliased page without being logged in.
4) Without logging in, navigate to https://www.facebook.com/voterudbrowne/
5) Note that Facebook insists on a login before displaying the page. This is the type of account that the API fails to return data for.

The Python app returns "Facebook photo NOT retrieved. status: FINISHED_QUERYING_GRAPHAPI_FOR_ONE_CANDIDATE".

SailingSteve commented 6 years ago

I'm 75% sure that this is the best we can do with the current Facebook APIs available to us.

The error message now reads: "The Facebook photo was not retrieved for one of the following reasons: An invalid URL was supplied, the candidate's facebook page sharing settings, or the use of an unaliased facebook user name."

DaleMcGrew commented 6 years ago

Thank you @SailingSteve!