ptwobrussell / Mining-the-Social-Web-2nd-Edition

The official online compendium for Mining the Social Web, 2nd Edition (O'Reilly, 2013)
http://bit.ly/135dHfs

Facebook's Graph API v2.0 kills data mining #205

Open somada141 opened 10 years ago

somada141 commented 10 years ago

I've just started reading the 2nd edition of your book and trying out your notebook, only to realize that I was getting just 3-4 friends returned.

After spending quite a bit of time looking into it I found out that the new version of the Graph API no longer allows for data on friends not using a given app to be returned. You can see the changelog at https://developers.facebook.com/docs/apps/changelog

Can you suggest any way to circumvent/overcome this? Expecting mutual authorization on an app is a pipe dream, and as far as I can see this policy change will kill data mining on Facebook.

Truly looking forward to your feedback

somada141 commented 10 years ago

I should mention that there's a crazy amount of outrage in the dev community. Take a look at the comments here: https://developers.facebook.com/blog/post/2014/04/30/the-new-facebook-login/

ptwobrussell commented 10 years ago

I'm really not sure how to respond to this one at the moment and will need to do some more digging to see if anything is going to budge at FB anytime soon or if any workarounds will become apparent. If it's indeed the new policy of Facebook to (inadvertently?) disallow data mining on friends, then the best I can probably do is revise the chapter and example code to be more in the spirit of data mining Facebook pages as opposed to friends. There's already some coverage of this in the chapter, so I'll likely have to expand it out a bit with more analysis techniques. It may take a little time to get this all sorted out, so please bear with me...

somada141 commented 10 years ago

No worries, I wasn't pointing any fingers here. From what I can see in the aforementioned link, Facebook's response to the outraged comments is "screw you, the user comes first and the devs can go figure it out themselves". To be honest, I won't miss the constant invites to play Candy Crush, but as a dev I'm super upset at having been locked out of that wealth of info. Between this and LinkedIn locking their API, can you suggest any alternative sources (apart from Twitter)?

ptwobrussell commented 10 years ago

Can you elaborate a bit on what you mean by "LinkedIn licking their API"?

somada141 commented 10 years ago

eh I wrote "locking" not "licking" :) I was slightly terrified at the idea I might have misspelled that :D I was referring to their changing what TOS violation means and locking out a myriad apps using their API (see links). Thus LinkedIn is no longer playing nice with other children :). To my understanding, if you want to gain access to the full range of LinkedIn goodness you now have to be a partner and you're not getting a partnership unless you're someone big.

http://www.fullcontact.com/blog/linkedin-state-of-crm-2014/
http://techcrunch.com/2011/07/01/linkedin-cuts-off-api-access-to-branchout-monsters-beknown-and-others-for-tos-violations/

ptwobrussell commented 10 years ago

Oops, that's my bad on the typo in quoting you.

As far as I know, LinkedIn is still pretty friendly to the types of data mining outlined in Chapter 3 of MTSW. Actually, most of the exercises in the chapter use the CSV export of your contacts in an address-book-style format as opposed to the web services API, although there are examples of how to take advantage of that as well.
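For readers following along, here is a minimal sketch of the kind of analysis a contacts CSV export allows, using only the Python standard library. The column names (`First Name`, `Last Name`, `Company`, `Job Title`) and the sample rows are assumptions made for illustration; check the header row of your own export before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an exported contacts file. The column names
# are assumed for illustration and may differ in a real export.
sample_csv = io.StringIO(
    "First Name,Last Name,Company,Job Title\n"
    "Ada,Example,Acme,Engineer\n"
    "Bob,Sample,Acme,Analyst\n"
    "Cy,Demo,Initech,Manager\n"
    "Di,Test,,Consultant\n"
)


def company_counts(csv_file, column="Company"):
    """Count how many contacts list each company, skipping blank entries."""
    reader = csv.DictReader(csv_file)
    names = ((row.get(column) or "").strip() for row in reader)
    return Counter(name for name in names if name)


counts = company_counts(sample_csv)
for company, n in counts.most_common():
    print(company, n)
```

With a real export, you would pass an open file handle instead of the in-memory sample, for example `open("contacts.csv", newline="")`, where the file name is again just a placeholder.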

JayPeitsch5 commented 10 years ago

Hello, have you made any headway on the Facebook issue? The project that I'm working on just ran into this problem and we are trying to figure out a workaround. I look forward to hearing from you.

ptwobrussell commented 10 years ago

@JayPeitsch5 I'm not sure that there is a workaround within the allowable terms of service. In all likelihood, I'll be revising Chapter 2 a little later this summer to focus more on mining business pages ("Facebook Pages") than on friend profiles, given the direction the Social Graph 2.0 API took. Haven't thrown in the towel just yet, but it's hard to imagine a way to win here unless Facebook changes the ToS back.

JayPeitsch5 commented 10 years ago

Alright, thank you. I have another question to make sure I'm understanding the 2.0 version. The app I'm working on requires a user to be able to access the profiles of Facebook users that they are not friends with, and to pull statuses, likes, groups, favorites, pictures, and more. From my understanding, I can still pull the names, likes, favorites, and groups, but I can't pull the status updates.

An additional question: if the user (of my app) is friends with the Facebook user, can they pull everything they would like from that user's Facebook page?

I apologize if this question is redundant, but I'm new to working with API integration.

ptwobrussell commented 10 years ago

The Facebook API docs are your best reference, but my understanding is that users can pretty much lock down whatever they want in their privacy settings, so in theory, a user may not even expose basic first/last name info. That's been standard for a while now. Where the Social Graph 2.0 API changes come into play is that a lot more of the user profile available to an app now requires the user to have explicitly opted in to the app. Previously, apps were able to just access a lot more info, whether or not users had opted in to the app.

I'd recommend spending some time with the Graph API Explorer. Create a few fake accounts, opt in to your sample app or the Graph API Explorer, and fiddle with their privacy settings to see the effects in an easy-to-use API console. It will probably simplify things quite a bit for you.
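To make the opt-in behavior concrete, here is a minimal sketch of how the friends edge could be inspected from Python, assuming the response has the general shape shown in the Graph API Explorer: a `data` list containing only the friends who have also authorized the app, plus a `summary.total_count` field. The helper names, the sample response, and the token placeholder are all hypothetical, and no network request is made; verify the actual response shape against the Facebook docs.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com/v2.0"


def friends_url(access_token, user_id="me"):
    """Build the URL for the friends edge of a user node."""
    query = urlencode({"access_token": access_token})
    return "{}/{}/friends?{}".format(GRAPH_BASE, user_id, query)


def summarize_friends(response):
    """Compare the friends an app can see with the reported total."""
    visible = [friend["name"] for friend in response.get("data", [])]
    total = response.get("summary", {}).get("total_count")
    hidden = None if total is None else total - len(visible)
    return {"visible": visible, "total": total, "hidden": hidden}


# Hypothetical response: only friends who opted in to the same app are
# listed, while the summary (assumed to be present) reports the total.
sample_response = {
    "data": [
        {"name": "Alice Example", "id": "1"},
        {"name": "Bob Example", "id": "2"},
        {"name": "Carol Example", "id": "3"},
    ],
    "summary": {"total_count": 250},
}

summary = summarize_friends(sample_response)
print(friends_url("YOUR_ACCESS_TOKEN"))
print(summary["visible"], summary["hidden"])
```

A gap between the visible list and the total is the "only 3-4 friends returned" symptom from the top of this thread: the other friends exist but have not authorized the app.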

quarkRain commented 9 years ago

hi @ptwobrussell

I was reading the book (2nd edition) and ran into the above issues. Most of the sample code in the IPy Notebook doesn't return expected results.

  1. Do you have a revised Chapter 2?
  2. There are certain things that I as a user am still able to do (e.g. go to a friend's page and see all their likes) that I can't do with the code you provided OR with the FB Graph Explorer tool. The fact that the info is still available to me if I manually visit the friend's page makes me wonder if there's a way to make HTTP calls and still be able to mine/analyze FB. Any thoughts/comments on this?
  3. It seems that many things have changed between the time you published 2E and now, so a lot of things don't seem to work anymore. Could you communicate major known issues on your website, so that a reader following the book is at least aware that certain examples don't work and doesn't (like me and the posters above) spend time assuming that s/he is doing something wrong?

Thank you and on to Ch3 now.

ptwobrussell commented 9 years ago

1 - I don't have a revised chapter 2 yet, but have been contemplating a substantial rewrite of chapter 2 that would be more focused on analyzing pages so that the chapter isn't so obsolete.

2 - Think of facebook.com as an "application" that is using the API just like the Graph Explorer or a notebook I've provided. The reason it can show you things is that your friends have (implicitly) authorized it to do so by opting in (creating a FB account). In that regard, it's just like any other FB app. You could certainly write some scripts to scrape the pages with a little effort, but it would be against the terms of service, so I can't recommend that you do this.

3 - I should definitely provide an update. I'd originally tried to do this early on, but haven't kept up very well as time has moved on. FYI: at this point, to my knowledge, the only major issues are with Chapter 2 and possibly a few examples in a much later chapter involving the parsing of microformats from certain web pages.

quarkRain commented 9 years ago

@ptwobrussell - thank you for the clarifications above. So glad and thankful that you're still involved here. I had recently picked up another book on the same topic and have to say that yours is so much more accessible and the VM does make a HUGE difference by allowing readers to practice along the way. Cheers.