adamritter / fastgron

High-performance JSON to GRON (greppable, flattened JSON) converter
MIT License
592 stars 10 forks

memory usage #22

Closed setop closed 8 months ago

setop commented 8 months ago

Hi,

Thanks for this great tool. Building a faster gron was on my todo list for a long time before I came across this project, led here by the simdjson project page. 🤘

fastgron is indeed a lot faster and uses less memory than gron, as advertised. Congrats!

Still, if I were to develop such a tool using a pull parser like simdjson, I would expect it to consume an almost fixed amount of memory, since only the current path has to be kept in memory at any point in time.

But in my benchmarks, I see memory usage of around 1.5x the size of the input file.

Can you help me understand this aspect?
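To make the "only the current path" idea concrete, here is a minimal Python sketch of a gron-style flattener. It is an illustration, not fastgron's code: `gron_lines` is a hypothetical helper, it walks a value that `json.loads` has already parsed rather than a real token stream, and it assumes object keys are identifier-like. The point is only that the emitting step itself needs nothing beyond the path stack.

```python
import json
from typing import Any, Iterator, List


def gron_lines(value: Any, path: List[str]) -> Iterator[str]:
    """Yield gron-style assignment lines; `path` is the only state kept."""
    prefix = "".join(path)
    if isinstance(value, dict):
        yield f"{prefix} = {{}};"
        for key, child in value.items():
            # Simplification: assumes keys need no quoting or escaping.
            path.append(f".{key}")
            yield from gron_lines(child, path)
            path.pop()
    elif isinstance(value, list):
        yield f"{prefix} = [];"
        for index, child in enumerate(value):
            path.append(f"[{index}]")
            yield from gron_lines(child, path)
            path.pop()
    else:
        # Scalars are written back as JSON literals.
        yield f"{prefix} = {json.dumps(value)};"


doc = json.loads('{"a": [1, {"b": "x"}], "c": null}')
lines = list(gron_lines(doc, ["json"]))
for line in lines:
    print(line)
```

With a true pull parser the same walk would be driven by tokens as they arrive, so the path stack would grow with nesting depth, not with input size.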

Regards

adamritter commented 8 months ago

Hi setop,

thanks for the nice comment.

I'm using SIMDJSON because it's by far the best JSON parser I've found (I would have preferred to use Rust if SIMDJSON weren't so great).

Unfortunately, it requires the JSON to be loaded into RAM and also padded with extra 0s for SIMD parsing; otherwise I could just use memory mapping in most cases. If that padding issue gets fixed in SIMDJSON, memory usage would go down dramatically.
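As a rough illustration of why the padding rules out plain memory mapping, the Python sketch below copies the input into a fresh buffer with zero bytes appended. `load_padded` is a hypothetical helper, and the 64-byte padding is an assumption about the parser's requirement, not a value taken from this thread. A memory-mapped file ends where the file ends, so there is no guaranteed room for those trailing zeros without a copy.

```python
PADDING = 64  # assumption: number of trailing zero bytes the parser wants


def load_padded(data: bytes, padding: int = PADDING) -> bytearray:
    """Copy `data` into a zero-filled buffer with `padding` spare bytes."""
    buf = bytearray(len(data) + padding)  # bytearray(n) is zero-initialised
    buf[: len(data)] = data
    return buf


raw = b'{"a":1}'
padded = load_padded(raw)
print(len(raw), len(padded))
```

The copy is what keeps the whole document resident in RAM: memory use is at least the input size plus the padding, before any parser bookkeeping or output buffering is counted.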