Open mrocklin opened 9 years ago
BTW, I'm playing with this dataset:
wget http://data.githubarchive.org/2015-01-{01..30}-{0..23}.json.gz
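For anyone following along, here is a minimal sketch of reading one of the downloaded files from Python. It assumes each file is gzipped, newline-delimited JSON with one event per line; the helper name `iter_events` is made up for illustration and is not part of any library mentioned in this thread.

```python
import gzip
import json


def iter_events(path):
    """Yield one parsed event per non-empty line of a gzipped JSON-lines file.

    Hypothetical helper: assumes one JSON object per line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)
```

Usage would look like `for event in iter_events("2015-01-01-0.json.gz"): ...`, which streams records instead of loading a whole file into memory.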
@mrocklin We've still got some ways to go on your requests, but ndt.json.discover is a start. I think the next step is bringing the JSON parser to Python and updating it correspondingly with some of the things you ask for. The parser is already written, so it's really just a case of adding some of the features you want in there.
OK, so I'm playing with some GitHub data. First challenge: find a correct datashape.
Fortunately, the Python datashape has some (very non-robust) heuristics for this kind of thing.
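To make "heuristics for this kind of thing" concrete, here is a toy sketch of shape discovery over parsed JSON. This is not the actual implementation in the Python datashape library or in ndt.json.discover; the names `discover_shape` and `unify`, and the `mixed` fallback type, are invented for illustration. The output strings only loosely imitate datashape notation.

```python
import json


def unify(shapes):
    """Collapse the element shapes of a list into one shape string (toy rule set)."""
    distinct = set(shapes)
    optional = "null" in distinct      # any null makes the element type optional
    distinct.discard("null")
    if not distinct:
        return "null"                  # empty list or all nulls: nothing to infer
    if distinct == {"int64", "float64"}:
        distinct = {"float64"}         # promote mixed ints and floats to float
    # "mixed" is a made-up placeholder for elements that do not agree
    base = distinct.pop() if len(distinct) == 1 else "mixed"
    return "?" + base if optional else base


def discover_shape(value):
    """Return a datashape-like string for a parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):        # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "float64"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        element = unify([discover_shape(v) for v in value])
        return "%d * %s" % (len(value), element)
    if isinstance(value, dict):
        fields = ", ".join(
            "%s: %s" % (key, discover_shape(val))
            for key, val in sorted(value.items())
        )
        return "{%s}" % fields
    raise TypeError("not a JSON value: %r" % (value,))


record = json.loads('{"id": 1, "tags": ["a", null]}')
print(discover_shape(record))
```

A sketch like this shows why such heuristics are fragile: it only looks at one record, so a field that is null or missing in the sampled record but present elsewhere in the data would be typed incorrectly.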
Here is a trace of a sanitized IPython session
Requests
@izaid just mentioned the following
I am totally 100% ok with this. Performance is not yet even on my radar.