stat157 / recent-quakes

Stat 157 Homework 2 due on Monday 2013-10-21 at 11:59pm

JSON in iPython Notebook #3

Open j-zhang opened 10 years ago

j-zhang commented 10 years ago

I installed pandas, since I didn't have it previously, along with the other packages from the preliminary setup steps.

sudo apt-get install python-pandas

I was able to reproduce the IPython Notebook when reading the data as a CSV file, but I'm having trouble reading it in JSON format. I chose the following feed: http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/1.0_week.geojson

But when I ran the cell with the code

import urllib
from pandas import read_json

I got an

ImportError: cannot import name read_json

The instructions say

You should use the pandas JSON parser to read the data instead of the read_csv function in the original code.

but I'm not sure why the error is occurring. Also, while searching online, I found that there seem to be differences between the urllib and urllib2 packages, and that some people import json directly and then use json.loads:

import json

Am I using the wrong feed or importing the wrong packages or using the wrong functions?

Thanks!

kqdtran commented 10 years ago

I think you are doing it correctly - I got the same error on my VM too. It appears that read_json only became available in pandas 0.12.0, whereas the version on our Ubuntu VM is 0.7.0. Running python on the command line, you can do

>>> import pandas as pd
>>> pd.__version__

to check your version of pandas. We could probably install pandas via pip instead to get the latest version, which has read_json... but I would wait for @aculich's official comment before doing that :-)
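A minimal sketch of the same version check as a script, assuming (as stated above) that read_json first appeared in pandas 0.12.0. The helper name supports_read_json is made up for illustration, and the import is guarded so the snippet still runs on a machine without pandas.

```python
import re

# Assumption taken from this thread: read_json first shipped in pandas 0.12.0.
MIN_VERSION = (0, 12)

def supports_read_json(version_string):
    """Return True if a pandas version string is at least 0.12 (hypothetical helper)."""
    # Keep only the leading major.minor numbers, e.g. "0.7.0" -> (0, 7).
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return (int(match.group(1)), int(match.group(2))) >= MIN_VERSION

try:
    import pandas as pd
except ImportError:
    pd = None  # pandas is not installed at all

if pd is not None:
    # Print the installed version and whether read_json should be importable.
    print(pd.__version__, supports_read_json(pd.__version__))
```

On the VM described above, the helper would return False for "0.7.0", which matches the ImportError.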

reenashah commented 10 years ago

I updated my pandas to the most recent version, and that seems to work in getting iPython to read the data! Thanks @kqdtran!

There still seems to be an error when reading the JSON data in the IPython notebook: I keep getting ValueError: arrays must all be same length.
Is anyone else encountering the same problem?

kqdtran commented 10 years ago

@reenashah Yep! I got the same error too. There is probably a way to get read_json to work, but I haven't found one yet. In the meantime, here's a hack using the built-in json module that works for me:

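import urllib  # Python 2; on Python 3, urlopen lives in urllib.request
import json
import pandas as pd

url = 'http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/significant_week.geojson'
# Download the feed and parse the JSON text into a Python dict
d = json.loads(urllib.urlopen(url).read())

# Build a DataFrame from the dict's top-level (key, value) pairs
data = pd.DataFrame(list(d.items()))
data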

In the docs, read_json is said to return a Pandas object like Series or DataFrame, so I'm fairly sure this is equivalent to read_json. Hope this helps :-).
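One possible reason for the ValueError, offered as a guess rather than a confirmed diagnosis: a GeoJSON feed is a single object whose top-level values (type, metadata, features, bbox) have different lengths, so they cannot be lined up as equal-length columns. For the same reason, DataFrame(d.items()) gives one row per top-level key rather than one row per earthquake. Below is a minimal sketch of flattening the features list into one record per quake. It uses a tiny made-up sample in place of the live feed, and flatten_features and all sample values are invented for illustration.

```python
import json

# Made-up sample shaped like a GeoJSON FeatureCollection (not real USGS data).
SAMPLE = """
{
  "type": "FeatureCollection",
  "metadata": {"title": "sample feed", "count": 2},
  "bbox": [-150.0, 38.0, 10.0, -122.0, 61.0, 30.0],
  "features": [
    {"type": "Feature", "id": "a1",
     "properties": {"mag": 4.5, "place": "somewhere"},
     "geometry": {"type": "Point", "coordinates": [-122.0, 38.0, 10.0]}},
    {"type": "Feature", "id": "b2",
     "properties": {"mag": 5.1, "place": "elsewhere"},
     "geometry": {"type": "Point", "coordinates": [-150.0, 61.0, 30.0]}}
  ]
}
"""

def flatten_features(feed):
    """Turn each feature into one flat dict (hypothetical helper)."""
    rows = []
    for feature in feed["features"]:
        row = dict(feature["properties"])  # copy mag, place, ...
        # GeoJSON points are ordered longitude, latitude, then a third value
        # (assumed here to be depth).
        lon, lat, depth = feature["geometry"]["coordinates"]
        row.update(id=feature["id"], longitude=lon, latitude=lat, depth=depth)
        rows.append(row)
    return rows

feed = json.loads(SAMPLE)
rows = flatten_features(feed)
print(rows[0]["place"], rows[0]["mag"])
```

If pandas is available, pd.DataFrame(rows) should then give one row per earthquake, with the property names as columns.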

reenashah commented 10 years ago

@kqdtran you rock!! Thanks!! :+1:

GalaxyNight-day commented 10 years ago

@kqdtran thanks!