I have this massive JSON file, and I run out of memory when trying to read it into Python. How would I implement a similar procedure using ijson?
import json

# There are (say) 1m objects - each its own JSON object - in this file.
with open('my_file.json') as json_file:
    data = json_file.readlines()

# Parse each line into a dict
list_of_objs = [json.loads(line) for line in data]

# But I only want about 200 of the JSON objects
desired_data = [obj for obj in list_of_objs if obj['feature'] == "desired_feature"]
Basically, the file is a list of JSON objects. I want a list of only those objects that have a certain value for a particular key, and for each matching object I want to keep every attribute.
The file itself contains a list of objects like:
{
  "review_id": "zdSx_SD6obEhz9VrW9uAWA",
  "user_id": "Ha3iJu77CxlrFm-vQRs_8g",
  "business_id": "tnhfDv5Il8EaGSXZGiuQGg",
  "stars": 4,
  "date": "2016-03-09",
  "text": "Great place to hang out after work: the prices are decent, and the ambience is fun. It's a bit loud, but very lively. The staff is friendly, and the food is good. They have a good selection of drinks.",
  "useful": 0,
  "funny": 0
}
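Since the readlines() approach in the question suggests one JSON object per line rather than a single array, a plain-stdlib generator may be all that's needed. This is a sketch under that one-object-per-line assumption; `stream_matching` and the 'feature' key are hypothetical names, not part of the original file.

```python
import json

def stream_matching(path, key, value):
    """Yield objects whose `key` equals `value`, reading one line at a time.

    Assumes one JSON object per line; iterating the file object directly
    means only a single line is ever held in memory, unlike readlines().
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            obj = json.loads(line)
            if obj.get(key) == value:
                yield obj

# Hypothetical usage, names taken from the question:
# desired_data = list(stream_matching('my_file.json', 'feature', 'desired_feature'))
```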
(x-post from Stack Overflow)