valyala / fastjson

Fast JSON parser and validator for Go. No custom structs, no code generation, no reflection.
MIT License

Reduce memory consumption when parsing large objects and arrays. #101

Open iv-menshenin opened 9 months ago

iv-menshenin commented 9 months ago

We use fastjson in our project and recently I found an unpleasant issue when processing huge json files (we have 25 megabytes and more than 300 thousand elements in the array): when cache.vs reaches its limit and performs memory reallocation, if the GC fails to clean up the memory in time, we experience a sharp spike in memory usage graphs.
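The reallocation spike described above comes from growing one big slice: each doubling copies the entire backing array while the old one waits for GC. One way to avoid that is to allocate values in fixed-size chunks, so growth never copies existing values. A minimal Go sketch of that general technique (my own illustration, not the actual PR diff; `Value` and `chunkedCache` are hypothetical names):

```go
package main

import "fmt"

// Value stands in for fastjson's Value type (hypothetical placeholder).
type Value struct{ data [48]byte }

// chunkedCache hands out Values from fixed-size chunks. Growing it only
// appends a new chunk; it never copies previously handed-out Values,
// unlike append on a single slice, which doubles and copies the whole
// backing array at each growth step. (Appending to c.chunks copies only
// the slice headers, so pointers into old chunks stay valid.)
type chunkedCache struct {
	chunks    [][]Value
	chunkSize int
}

func (c *chunkedCache) getValue() *Value {
	if len(c.chunks) == 0 || len(c.chunks[len(c.chunks)-1]) == c.chunkSize {
		// Last chunk is full: start a new one; old chunks stay in place.
		c.chunks = append(c.chunks, make([]Value, 0, c.chunkSize))
	}
	last := len(c.chunks) - 1
	c.chunks[last] = c.chunks[last][:len(c.chunks[last])+1]
	return &c.chunks[last][len(c.chunks[last])-1]
}

func main() {
	c := &chunkedCache{chunkSize: 1024}
	for i := 0; i < 300_000; i++ {
		c.getValue()
	}
	fmt.Println(len(c.chunks)) // prints 293 (ceil(300000/1024) chunks)
}
```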

The funniest part is that we barely noticed this, because our application kept crashing with OOM, so we had to temporarily disable the limits for testing.

After working with pprof for a while, I found where the most memory allocations are happening and worked on improvements. I tried several options and this one turned out to be the most efficient, as shown by the tests in our environment.

If you don't accept this pull request, it's okay, then we'll just continue using our fork.

Thank you.

ysawa0 commented 9 months ago

This patch helped mitigate memory spikes in our application. The memory usage chart was clearly improved. Since this seems to be a positive change, I wonder if maintainers (cc @valyala) can consider it? Thank you.

StarpTech commented 4 weeks ago

Hi @iv-menshenin, could you provide a way to validate your improvements?

iv-menshenin commented 4 weeks ago

> Hi @iv-menshenin, could you provide a way to validate your improvements?

It's been working in our production, nativerent, for as long as my PR has lived (≈500 RPS). Or do you need some unit tests?

StarpTech commented 4 weeks ago

No, I'm asking for a simple reproduction. How did you test your improvements? Is it possible to verify this with the benchmarks of this repo? If not, can you provide one?

iv-menshenin commented 3 weeks ago

> No, I'm asking for a simple reproduction. How did you test your improvements? Is it possible to verify this with the benchmarks of this repo? If not, can you provide one?

I'll see what I can do.

iv-menshenin commented 3 weeks ago

@StarpTech Here's what I found. As I have said before, my suggestion addresses GC load peaks when processing really big data. I multiplied the contents of the ./testdata/canada.json file until it reached 18 megabytes and ran the benchmarks. Here is what I saw for the upstream valyala code:

```
BenchmarkParse/canada/fastjson-8         123   9405854 ns/op  1914.52 MB/s        37 B/op       0 allocs/op
BenchmarkParse/canada/fastjson-get-8      49  24405223 ns/op   737.86 MB/s  99548908 B/op  130551 allocs/op
```

Here's what happens after the improvement:

```
BenchmarkParse/canada/fastjson-8         136   8963098 ns/op  2009.09 MB/s        12 B/op       0 allocs/op
BenchmarkParse/canada/fastjson-get-8     111   9018742 ns/op  1996.69 MB/s        15 B/op       0 allocs/op
```

If required, I can commit this test file to demonstrate what I said.
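The enlarged test file could be produced with a small generator along these lines (a sketch: the `./testdata/canada.json` path follows this repo's layout, while `enlarge` and the output filename are my own illustrative names):

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// enlarge wraps repeated copies of doc in a JSON array until the result
// is at least target bytes, yielding a valid oversized JSON document.
func enlarge(doc []byte, target int) []byte {
	var buf bytes.Buffer
	buf.WriteByte('[')
	for buf.Len() < target {
		if buf.Len() > 1 {
			buf.WriteByte(',')
		}
		buf.Write(doc)
	}
	buf.WriteByte(']')
	return buf.Bytes()
}

func main() {
	src, err := os.ReadFile("./testdata/canada.json")
	if err != nil {
		panic(err)
	}
	big := enlarge(src, 18<<20) // ~18 MB, as in the benchmark numbers above
	if err := os.WriteFile("./testdata/canada-large.json", big, 0o644); err != nil {
		panic(err)
	}
	fmt.Println("wrote", len(big), "bytes")
}
```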

StarpTech commented 3 weeks ago

@iv-menshenin that looks promising. Is there a big performance drop for smaller files? A test file would be great. We actively maintain a fork and have already incorporated a lot of other changes that didn't get merged into this repository. Would you be interested in contributing to it?

iv-menshenin commented 3 weeks ago

> Is there a big performance drop for smaller files?

No, there isn't. There is an upfront memory allocation of 32 kilobytes, but in practice, thanks to the reuse of the parser, this has only a positive impact.

> A test file would be great.

I will add a new case

> Would you be interested in contributing to it?

I would like to.
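The parser reuse mentioned above is what fastjson's `ParserPool` provides; the same pooling pattern can be sketched with the standard library alone, showing why a one-time 32 KB allocation is amortized away (the `parser` type and buffer size below are illustrative stand-ins, not fastjson's actual internals):

```go
package main

import (
	"fmt"
	"sync"
)

// parser stands in for fastjson.Parser: it carries a buffer allocated
// once (32 KB here, matching the upfront allocation mentioned above)
// and reused across parses instead of being reallocated every time.
type parser struct{ buf []byte }

var pool = sync.Pool{
	New: func() any { return &parser{buf: make([]byte, 0, 32<<10)} },
}

// parse simulates one parse: borrow a parser, reuse its buffer capacity,
// and return it to the pool for the next caller.
func parse(payload string) int {
	p := pool.Get().(*parser)
	defer pool.Put(p)
	p.buf = append(p.buf[:0], payload...) // reuses capacity; no new allocation
	return len(p.buf)
}

func main() {
	for i := 0; i < 1000; i++ {
		parse(`{"hello":"world"}`)
	}
	fmt.Println(parse(`{"x":1}`)) // prints 7
}
```

fastjson exposes this as `fastjson.ParserPool` with `Get`/`Put`; the stdlib `sync.Pool` version above is just to make the amortization visible in isolation.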

iv-menshenin commented 3 weeks ago

@StarpTech Digging deeper into the problem again, I remembered that the issue is not the number of allocations but the huge amount of memory being consumed. After checking the benchmarks, I realized that things are bad in the tests too. Look at this `var benchPool ParserPool` that is shared between all the tests! All the results here have become non-deterministic: when I run the tests over and over again, I get a different allocation count each time, sometimes zero, sometimes not. So there is no allocation benefit in my PR, sorry (these were bad tests). But! There is a memory consumption benefit, which I can show you, just give me time =) I will do all this in your fork.
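Why a shared package-level pool makes allocation counts non-deterministic can be sketched with the standard library (all names below are illustrative, not the repo's actual benchmark code): whether an iteration allocates depends entirely on what earlier runs left in the pool.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// benchPool is shared across runs, like the package-level
// `var benchPool ParserPool` in this repo's benchmarks.
var benchPool sync.Pool

// allocsOf counts heap allocations performed by f.
func allocsOf(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs
}

// get simulates one benchmark iteration: take a buffer from the shared
// pool, paying a large allocation only when the pool happens to be empty.
func get() {
	v := benchPool.Get()
	if v == nil {
		buf := make([]byte, 1<<20)
		v = &buf
	}
	benchPool.Put(v)
}

func main() {
	// The measured count depends on pool state left behind by whatever
	// ran before, so repeated runs report different allocs/op.
	fmt.Println("cold:", allocsOf(get), "warm:", allocsOf(get))
}
```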

iv-menshenin commented 3 weeks ago

In the meantime, I'm giving some validation in this PR

You can get this memory consumption graph via pprof if you run the tests with the following parameters:

```shell
go test --run=None --bench="BenchmarkParse/.*/fastjson" --memprofile=mem.out
```

(screenshot: pprof memory consumption graph before the change)

No matter how many times I run the tests, I see 0 memory allocations reported, but profiling gives a clearer picture of what's going on: more than 1 gigabyte of memory is being consumed.

And this is what the graph looks like after my improvements

(screenshot: pprof memory consumption graph after the improvements)

I think it's pretty clear