distributed-system-analysis / pbench

A benchmarking and performance analysis framework
http://distributed-system-analysis.github.io/pbench/
GNU General Public License v3.0

indexer.py memory bloat in parsing large results.json file #1777

Open dbutenhof opened 4 years ago

dbutenhof commented 4 years ago

The standard JSON package reads the entire file to validate and parse the structure, which can consume a lot of memory.

Research alternatives, including streaming JSON parsers or even custom parsing. The JSON we're reading is just a list of objects; we want to parse and validate each list item, but we could be less discriminating about the outer list if there's a more efficient alternative.
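
For illustration, here is a minimal sketch of the "custom parsing" idea, assuming results.json is a single top-level JSON array of objects; `iter_json_array` is a hypothetical helper, not existing indexer code:

```python
import json


def iter_json_array(path, chunk_size=65536):
    """Yield each element of a top-level JSON array without holding the
    whole file in memory.  Assumes the array elements are JSON objects,
    so raw_decode() cannot succeed on a partially-read element."""
    decoder = json.JSONDecoder()
    with open(path, "r") as f:
        buf = f.read(chunk_size).lstrip()
        if not buf.startswith("["):
            raise ValueError("expected a top-level JSON array")
        buf = buf[1:]  # skip the opening '[' of the outer list
        while True:
            buf = buf.lstrip()
            if buf.startswith("]"):
                return  # end of the outer list
            if buf.startswith(","):
                buf = buf[1:].lstrip()
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                more = f.read(chunk_size)
                if not more:
                    raise  # truncated or malformed file
                buf += more  # element spans a chunk boundary; keep reading
                continue
            yield obj
            buf = buf[end:]
```

This parses and validates each list item individually while being deliberately "less discriminating" about the outer list, trading strict whole-document validation for bounded memory use.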

npalaska commented 4 years ago

I did some investigation into this issue. There are a couple of points:

  1. We need an iterative streaming parser for JSON files that are too large to load into memory.
  2. However, parsing iteratively would require changing some behavior in our code. It would be tricky to handle code like https://github.com/distributed-system-analysis/pbench/blob/d8f835dd81abf6084c807c5caa507ceb34f9fae6/lib/pbench/server/indexer.py#L133-L134 and https://github.com/distributed-system-analysis/pbench/blob/d8f835dd81abf6084c807c5caa507ceb34f9fae6/lib/pbench/server/indexer.py#L170-L186, where we try to get all the keys at once, or where we take the extracted JSON dictionary and put it into another dictionary (the template body). I think this would mean loading the file into memory again, since we cannot put an iterator object into the template body.
  3. There is a nice Python package called ijson, built on the popular YAJL iterative JSON parsing library (a minimal usage sketch follows this list). Using it would mean relying on a third-party library; otherwise we would have to write our own wrapper similar to ijson, which might require significant effort. Any ideas about implementing such a wrapper are welcome.
  4. Does it make sense to use an SQLite-like database in the future if the JSON files grow quite big? That way we would not have to load everything into memory and could store everything on disk.
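
A minimal sketch of what point 3 could look like with ijson, assuming results.json is a top-level JSON array; `iter_results` is a hypothetical helper, not existing indexer code:

```python
import ijson


def iter_results(path):
    """Stream each element of the top-level JSON array in results.json
    instead of json.load()-ing the whole file at once."""
    with open(path, "rb") as f:
        # "item" is ijson's prefix for the elements of a top-level array.
        for doc in ijson.items(f, "item"):
            yield doc


# Hypothetical usage: validate and index one document at a time,
# keeping only the current element in memory.
# for doc in iter_results("results.json"):
#     validate(doc)
```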

Thoughts?