carlsednaoui / networkmill


Get a name & photo from email address #9

Open jescalan opened 12 years ago

jescalan commented 12 years ago

Facebook

So it turns out facebook made this pretty difficult to find, since they don't want scummy people to be able to use it. I think this is pretty important though, and I found a way we can use the graph api to make it happen. Since it is public data, it's definitely legal.

https://graph.facebook.com/search?q=EMAIL_HERE&type=user&access_token=AAACEdEose0cBAP7IzGZACiiBXqBBBoSnNGYHoV9A0BN4lcVrqrze4AbjOMZB4WB7ZBWdZBVbolzYaoL2sKvfukk44ASCgMgZD

The access token is mine, but we should probably generate a new random one to use in production. That way, if they make this illegal or something, none of our personal accounts gets associated with it.

If there's a match, this url will return json with the name and facebook id. We can then pull out the facebook id in order to get their profile picture using this url:

http://graph.facebook.com/FACEBOOK_ID/picture

Test this out if you want, and feel free to sub in your own access token (generate one at https://developers.facebook.com/tools/explorer/)

Example

https://graph.facebook.com/search?q=jescalan@hamilton.edu&type=user&access_token=AAACEdEose0cBAP7IzGZACiiBXqBBBoSnNGYHoV9A0BN4lcVrqrze4AbjOMZB4WB7ZBWdZBVbolzYaoL2sKvfukk44ASCgMgZD

http://graph.facebook.com/1238130848/picture
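A minimal Python sketch of the two-step lookup described above: build the search URL, then turn the first match into a picture URL. The response shape (`{"data": [{"name": ..., "id": ...}]}`) and the helper names are assumptions for illustration; only the two URL patterns come from this thread, and the search endpoint may no longer behave this way.

```python
from urllib.parse import urlencode

GRAPH = "https://graph.facebook.com"


def search_url(email, access_token):
    """Build the user-search URL quoted in this thread for one email address."""
    query = urlencode({"q": email, "type": "user", "access_token": access_token})
    return f"{GRAPH}/search?{query}"


def picture_url(facebook_id):
    """Build the profile picture URL for a Facebook id."""
    return f"{GRAPH}/{facebook_id}/picture"


def first_match(payload):
    """Pull name, id and picture URL out of a decoded search response.

    Assumption: the response looks like {"data": [{"name": ..., "id": ...}]}.
    Returns None when there is no match.
    """
    matches = payload.get("data", [])
    if not matches:
        return None
    top = matches[0]
    return {"name": top["name"], "id": top["id"], "picture": picture_url(top["id"])}
```

The token is passed in as a parameter so it can come from config rather than being hard-coded in a URL.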

Gravatar

I also think even before facebook, we should check and see if they have a gravatar. Their api is much better and easier, and if there's a match it's a lot more reliable. All we have to do is hash the email and send off a request.

http://en.gravatar.com/site/implement/images/

We can also pull a name from gravatar using their profiles api and some clever parsing: http://en.gravatar.com/site/implement/profiles/
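As a rough Python sketch of the "hash the email and send off a request" step: Gravatar's image docs describe an MD5 hash of the trimmed, lowercased address. The function names and the default size are illustrative, and the `.json` profile URL follows the profiles docs linked above as I understand them, so treat it as an assumption.

```python
import hashlib


def gravatar_hash(email):
    """MD5 hex digest of the trimmed, lowercased email address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def avatar_url(email, size=200):
    """Image URL; d=404 asks Gravatar for a 404 instead of a default image."""
    return f"https://www.gravatar.com/avatar/{gravatar_hash(email)}?s={size}&d=404"


def profile_url(email):
    """Profile JSON URL (assumed shape, based on the profiles docs)."""
    return f"https://www.gravatar.com/{gravatar_hash(email)}.json"
```

With `d=404`, a missing Gravatar shows up as a plain 404 status, which makes "is there a match?" a one-request check before falling back to facebook.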

Between the two of these, I'd say there's a pretty solid chance we should be able to pull anyone's name and image based on their email, and save users a ton of time manually putting them in

carlsednaoui commented 12 years ago

Wow, I love your clever trick to circumvent the FB limitations. I tried to get the photo from http://graph.facebook.com/1238130848/picture but it's not working for me (might be because I'm in Turkey at the moment and the internet is horrendous...).

Also, definitely agree with you w/ regards to doing a check w/ gravatar first.

Btw, I'll look to see if LinkedIn, Twitter, Rapportive or G+ could be of use here - haven't found anything helpful yet.

And, last but not least - amazing progress on the design!!

jescalan commented 12 years ago

Haha yeh the Facebook one is killer. I've been thinking about this more and had a few thoughts.

carlsednaoui commented 12 years ago

I like where this is going! And once we find a 100% match we can save that picture URL in the contact model and boom!

Btw, maybe this can help: http://rapportive.com/privacy#sources

Looking forward to finishing all of the basic stuff so that we can start working on this!

jescalan commented 12 years ago

Sweet - I went through a couple of these and picked out some pages and resources that could be useful to us:

On the other hand, like you said, I think these extras can come much later - we can probably nail 90% of people off gravatar and facebook

carlsednaoui commented 12 years ago

Awesome find on these! Great job!!

I'm really looking forward to working on this and the Ajax parts with you :)

jescalan commented 12 years ago

We can also run a venmo search - I know hardly anyone uses it, but if they do we can grab pretty much everything off it.

https://venmo.com/api#application-api-calls

I'm somewhat interested in seeing if I can use the venmo api for other things as well but it seems limited.

Also, got a hook on mashable last night, so when we launch this it's pretty likely that we can get a mashable story written up for us : )

carlsednaoui commented 12 years ago

Ha, you're killing it bro, you're killing it!

Playing with all of the APIs listed in this thread will def. be fun :)

jescalan commented 12 years ago

Looks like there's a pretty solid chance we'll be able to make a few bogus accounts and hit the actual rapportive api as well. Here are their docs - they use jsonp, and as long as we make an account (or a few) and get a token we should be able to hit their api direct with an email address and pull down whatever info they have:

http://code.rapportive.com/raplet-docs/

This one is also a decent resource for getting a photo from a name specifically: http://www.crunchbase.com/api

jescalan commented 12 years ago

Ok so there is a way to get a twitter handle from an email address, but it's super ghetto. Here it is:

https://twitter.com/#!/who_to_follow/import

In your twitter settings, there's a checkbox that reads 'allow people to find me by my email address' which by default is checked, and most people haven't noticed. This allows your email to be associated anonymously with your account through this import contacts from mail page above.

The issue is how we would execute this. It's a very indirect association - the way twitter does it is takes your entire address book, runs it against their database for matches in the background, and returns in a random order all the accounts for which a match was hit. If we could somehow add any contact added on networkmill to the address book of one dummy gmail account linked to a dummy twitter account then run this search, pull the twitter handle, and delete the contact once the result comes in, although it would not be super fast, it might accomplish our goal.

jescalan commented 12 years ago

We also might want to give this a shot down the road when we have some premium signups:

http://www.fliptop.com/

carlsednaoui commented 12 years ago

I like the twitter hack, we'd need to test that out

jescalan commented 11 years ago

Starting to think about this a little more thoroughly, because this will be the core of how we generate any money from this guy - found a few good resources today. We're going to have to get very crafty with our scraping and api pulls, but I'm confident that we can hit matches on 90+ percent of people if we do it right.

If we could write a reliable spokeo scraper, that could be one of the best resources out there for hard-to-find email identities. It returns a full name and picture, and if we got the premium service (which is $4.00 a month, less than the cost of one user), it would give back its shot at a bunch of social profiles.

I'm currently going through the source for the third bullet point I posted pretty thoroughly. Something smart they do in there is snag any sort of bio or description on any site they find and run it through a regex that pulls out links and email addresses. It even replaces a bunch of common "email obfuscation" techniques like "jeff [at] gmail dot com" - super simple to just detect these and replace. Also why you should never use them - just obfuscate with javascript (example: http://jenius.me/#!/contact - view source, no email).
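A small Python sketch of that de-obfuscation idea: normalize "[at]" / "dot" style tricks, then pull addresses out with a regex. These patterns are my own guess at a reasonable starting set, not the ones from the source being discussed, and the bare " at " / " dot " rules can produce false positives on ordinary prose.

```python
import re

# Bracketed forms like "[at]", "(at)", "{at}", plus a bare " at " between words.
AT = re.compile(r"\s*[\[\(\{]\s*at\s*[\]\)\}]\s*|\s+at\s+", re.IGNORECASE)
DOT = re.compile(r"\s*[\[\(\{]\s*dot\s*[\]\)\}]\s*|\s+dot\s+", re.IGNORECASE)
# Deliberately simple email pattern; not a full RFC 5322 matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Undo common obfuscations, then return every email-looking string."""
    cleaned = DOT.sub(".", AT.sub("@", text))
    return EMAIL.findall(cleaned)
```

For example, `extract_emails("jeff [at] gmail dot com")` returns `["jeff@gmail.com"]`.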

In addition, they take a username or anything before the @ in an email address and run a search for that same name (and maybe we should do a few variations on it) across a bunch of social networks. People do tend to use the same usernames. What would be really smart would be if we could run a sort of lexical distance search - this way it would return similar names with maybe an underscore or a few letters different as well.
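One way to sketch the lexical distance idea in Python is plain Levenshtein edit distance between the part before the @ and each candidate username. The function names and the `max_distance=2` threshold are illustrative assumptions, and where the candidate list comes from is left open.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def similar_usernames(email, candidates, max_distance=2):
    """Return candidates close to the email's local part, nearest first."""
    local = email.split("@", 1)[0].lower()
    scored = [(edit_distance(local, c.lower()), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]
```

With `jescalan@hamilton.edu` and the candidates `["j_escalan", "jescalan", "someoneelse"]`, this keeps `jescalan` (distance 0) and `j_escalan` (distance 1) and drops the rest.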

Spokeo also gives a lot of insight into where they search - they have a list of networks they hit and maintain a huge database that gets queried instead of a live search. For us, I don't think that's necessary, since our aim is to get the most accurate info as quickly as possible and we know which sources will more likely give back a guaranteed hit (facebook backdoor, possibly twitter, github, etc), and should hit those first and return if we have a match rather than waste time and resources on a large slow database.
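The "hit the reliable sources first and return on a match" flow could look roughly like this in Python. The source names and the fetcher signature (email in, dict or `None` out) are hypothetical placeholders, not existing networkmill code.

```python
def lookup(email, sources):
    """Try each (name, fetch) pair in priority order; stop at the first hit.

    Assumption: each fetch(email) returns a dict of profile fields on a
    match, or None / an empty dict when nothing was found.
    """
    for name, fetch in sources:
        result = fetch(email)
        if result:
            return {"source": name, **result}
    return None


# Hypothetical stand-ins for the real lookups, ordered most reliable first.
def gravatar_stub(email):
    return None  # pretend there was no Gravatar match


def facebook_stub(email):
    return {"name": "Test User"}  # pretend facebook matched


SOURCES = [("gravatar", gravatar_stub), ("facebook", facebook_stub)]
```

Because the list is ordered, slower or less reliable sources only get called when the earlier ones come back empty.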

I definitely think it's a good idea to do a reverse image search as well for any sort of profile image we can come up with. People very often re-use profile photos and with tools like tineye and reverse google image search, it's pretty likely we could find a few more matches on social profiles if we can find one.

Finally, there has to be some way we can scrape rapportive as well - if we can set up some sort of dummy account that when queried would automatically enter in that person as a contact, scrape the rapportive return, then remove them, it would be pretty dope.

carlsednaoui commented 11 years ago

Ok, so this issue comment here is gold. The basic backend stuff is almost all done (all that is pending is the categories for contacts). Would love to meet up with you so that we can discuss this in person and start playing with these apis you suggested.

I think the main focus would be to get the accurate name first (since we're thinking about blasting off the name field), followed by their picture. For the name, we can query Linkedin + Facebook + Twitter and if we have 2 name matches out of 3 we should be more than good.
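The 2-out-of-3 rule can be sketched in Python as a simple vote over normalized names. The normalization (lowercase, collapsed whitespace) and the function names are assumptions; real names would likely need fuzzier matching for middle names and nicknames.

```python
from collections import Counter


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(name.lower().split())


def consensus_name(names, required=2):
    """Return the first-seen spelling of a name that enough sources agree on.

    `names` holds one entry per source; None or "" means no result.
    """
    found = [n for n in names if n]
    counts = Counter(normalize(n) for n in found)
    if not counts:
        return None
    key, votes = counts.most_common(1)[0]
    if votes < required:
        return None
    return next(n for n in found if normalize(n) == key)
```

For example, `consensus_name(["Ada Lovelace", "ada  lovelace", "A. L."])` returns `"Ada Lovelace"`, while three different names return `None`.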

For rapportive, I wonder if going through their google extension source code will reveal anything interesting. Out of all the services we've mentioned I really think that scraping rapportive would be the way to go (the beauty is that as soon as you add a contact to a new email you get the rapportive results in the #rapportive-sidebar div) - we gots to crack their code!