Yelp / dataset-examples

Samples for users of the Yelp Academic Dataset
http://www.yelp.com/academic_dataset

Categories / Attributes / Hours are still nested #34

Open tootrackminded opened 7 years ago

tootrackminded commented 7 years ago

Hi all,

I'm having trouble with the JSON-to-CSV conversion. My main objective is to create a flat file in which hours, categories, and attributes are also completely flat (and, looking at the code, that seems to be the intent of the script). However, when I run the code as provided, the categories, attributes, and hours are still nested instead of being expanded into a new binary column for each category in each row.

Here's a summary of the output:

[image: summary of the output]

As for the script, I'm using exactly what's given here (except with a hard-coded file path instead of taking the path as an input).

Any advice would be greatly appreciated!
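
For readers who land on this question, here is a minimal sketch of the kind of flattening described above. It is not the repository's converter. It assumes each record has already been parsed into a Python dict, and the names `flatten_record` and `write_flat_csv`, the `.` separator, and the sample record are all hypothetical. Nested dicts become dotted column names, and each list member (such as a category) becomes its own binary column.

```python
import csv


def flatten_record(record, parent_key="", sep="."):
    """Recursively flatten nested dicts into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects such as attributes or hours.
            flat.update(flatten_record(value, column, sep))
        elif isinstance(value, list):
            # One binary column per list member, e.g. categories.Pizza = 1.
            for item in value:
                flat[f"{column}{sep}{item}"] = 1
        else:
            flat[column] = value
    return flat


def write_flat_csv(records, out_file):
    """Write flattened records to an open text file; return the column names."""
    rows = [flatten_record(r) for r in records]
    # The header is the union of all keys seen across the rows.
    columns = sorted({name for row in rows for name in row})
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Hypothetical record; real field names may differ between dataset versions.
sample = {
    "business_id": "abc123",
    "categories": ["Pizza", "Italian"],
    "attributes": {"Parking": {"garage": False, "street": True}, "WiFi": "free"},
    "hours": {"Monday": {"open": "11:00", "close": "22:00"}},
}
flat = flatten_record(sample)
```

Missing columns are left empty by `restval=""`; for the binary category columns you may prefer `0`, which would need a per-column default rather than a single fill value.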

banwarp commented 6 years ago

Did you figure out how to flatten these data frames? The JSON files have extra quotation marks around those data frames, so when I read them into R using jsonlite::stream_in, they come in as character strings, not data frames.
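
If the nested values really are stored as quoted strings, one possible workaround is to decode them a second time after the initial JSON parse. The sketch below is in Python rather than R and rests on an assumption: that each quoted value is either valid JSON or a Python-style literal (single quotes, `True`/`False`). The helper names `parse_nested_string` and `decode_record` and the sample record are hypothetical.

```python
import ast
import json


def parse_nested_string(value):
    """Try to turn a string holding a nested structure back into a dict or list."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    if not text or text[0] not in "{[":
        return value  # an ordinary string, leave it alone
    try:
        return json.loads(text)
    except ValueError:
        pass  # not valid JSON, try a Python-style literal next
    try:
        # literal_eval only accepts literals, so it does not execute code.
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return value  # undecodable, keep the original string


def decode_record(value):
    """Recursively decode stringified structures anywhere inside a record."""
    value = parse_nested_string(value)
    if isinstance(value, dict):
        return {k: decode_record(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_record(v) for v in value]
    return value


# Hypothetical record in which one attribute is a quoted Python-style dict.
record = {
    "business_id": "abc123",
    "attributes": {
        "BusinessParking": "{'garage': False, 'street': True}",
        "WiFi": "free",
    },
}
decoded = decode_record(record)
```

Strings that do not start with `{` or `[` pass through unchanged, so ordinary text fields are not affected.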