gedankenstuecke / scihub_analysis

Analyzing the April 2016 Data about the Usage of Sci-Hub
http://ruleofthirds.de/analyzing-scihub-data/
MIT License

Use Google Correlate per US states #2

Open pdehaye opened 8 years ago

pdehaye commented 8 years ago

A Google service, Google Correlate, lets you find search terms whose popularity correlates with a given index across US states. What do "Sci-Hub downloads per inhabitant" correlate with at the state level?

Probably lots of junk!

gedankenstuecke commented 8 years ago

Would still be a fun thing to do. The only problem: One would first need to assign each US location to a state. Is there a good API/list out there one could use for that? I guess Google Maps could allow for such a thing in principle?

pdehaye commented 8 years ago

What do you have for each US location? GPS coordinates? The MaxMind output almost certainly includes the state, no? Did you look at her published worksheet with the data?

Otherwise, Google seems to have a reverse geocoding API for this, though it requires a key: https://developers.google.com/maps/documentation/geocoding/intro#reverse-example

There is also raw GeoJSON data for US states: http://data.okfn.org/data/core/geo-admin1-us#resource-admin1-us
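With the GeoJSON route, the state lookup could be done offline with a point-in-polygon test instead of an API call. Below is a minimal sketch using ray casting, not a tested pipeline. It assumes the file is a standard GeoJSON FeatureCollection with Polygon or MultiPolygon geometries and that the state name sits under a `name` property; that key is a guess and should be checked against the actual file. `load_features` and `state_for_point` are hypothetical helper names.

```python
import json


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_polygon(lon, lat, rings):
    """First ring is the outer boundary; any further rings are holes."""
    if not rings or not point_in_ring(lon, lat, rings[0]):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in rings[1:])


def state_for_point(features, lat, lon, name_key="name"):
    """Return the name of the first feature containing the point, or None.

    name_key is an assumption about the property that holds the state name.
    """
    for feature in features:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue
        if any(point_in_polygon(lon, lat, rings) for rings in polygons):
            return feature["properties"].get(name_key)
    return None


def load_features(path):
    """Read the feature list from a GeoJSON FeatureCollection file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["features"]
```

Note that GeoJSON stores coordinates as longitude first, then latitude, which is the reverse of the order in the Sci-Hub log. Looping over every state for every download is slow but probably acceptable for a one-off analysis; caching the result per distinct coordinate pair would cut the work considerably.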

gedankenstuecke commented 8 years ago

This is what a single entry looks like:

2016-01-01 00:00:32 10.1063/1.1699273 56ed2c6436b2f United States East Lansing 42.7369792,-84.4838654

So it gives only the city and country, but also the coordinates, which should make it easy to look up the state via the Google API, I guess.
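For reference, pulling the fields out of such an entry could look roughly like this. It is a sketch that assumes the raw file is tab-separated with exactly the six fields shown above (timestamp, DOI, an anonymized identifier, country, city, coordinates); the `Entry` type and its field names are made up for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    timestamp: str
    doi: str
    user_id: str  # assumed meaning of the third field
    country: str
    city: str
    lat: float
    lon: float


def parse_entry(line: str) -> Entry:
    """Split one tab-separated log line into typed fields."""
    timestamp, doi, user_id, country, city, coords = line.rstrip("\n").split("\t")
    # Coordinates are written as "latitude,longitude".
    lat, lon = (float(x) for x in coords.split(","))
    return Entry(timestamp, doi, user_id, country, city, lat, lon)
```

Lines with a missing field or empty coordinates would raise a `ValueError` here, so a real run would probably want to catch that and skip or log those lines.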

pdehaye commented 8 years ago

Indeed, it shouldn't be too hard as long as Google's API is not too restrictive. The state comes back explicitly in the response.
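Once every entry has a state, the per-inhabitant index from the original question is just a count divided by a population figure. A minimal sketch, assuming the states have already been resolved into a list and that a `population` mapping comes from some external source such as census data (no figures are included here); the function name is hypothetical.

```python
from collections import Counter


def downloads_per_inhabitant(states, population):
    """Map each state to its download count divided by its population.

    states: one state name per download, None where no state was found.
    population: state name -> number of inhabitants (supplied by the caller).
    """
    counts = Counter(s for s in states if s is not None)
    # States without a usable population figure are left out.
    return {s: counts[s] / population[s] for s in counts if population.get(s)}
```

The resulting mapping is the per-state series that would then be compared against search-term data.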

gedankenstuecke commented 8 years ago

Now I only need to find the time to do this ;)