google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library
https://flatbuffers.dev/
Apache License 2.0
23.18k stars 3.24k forks source link

Memory leak in new grpc integration [C++, Ubuntu, grpc, master] #4348

Closed cberner closed 7 years ago

cberner commented 7 years ago

There appears to be a memory leak in the GRPC integration. I've reduced it down to a small example, in this repo: https://github.com/cberner/flatbuffer_leak

You can reproduce with the following steps:

  1. git clone https://github.com/cberner/flatbuffer_leak
  2. cd flatbuffer_leak && docker build .
  3. docker run --rm

Running it produces the following error (and the process quickly is OOM killed):

=================================================================
==15==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 668928 byte(s) in 2613 object(s) allocated from:
    #0 0x7fd4209f8602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x7fd4203a07ae in gpr_malloc (/usr/local/lib/libgrpc.so.4+0xc07ae)

Indirect leak of 1046201221 byte(s) in 1574 object(s) allocated from:
    #0 0x7fd4209f8602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x7fd4203a07ae in gpr_malloc (/usr/local/lib/libgrpc.so.4+0xc07ae)

Indirect leak of 648 byte(s) in 1 object(s) allocated from:
    #0 0x7fd4209f8961 in realloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98961)
    #1 0x7fd4203a0853 in gpr_realloc (/usr/local/lib/libgrpc.so.4+0xc0853)

SUMMARY: AddressSanitizer: 1046870797 byte(s) leaked in 4188 allocation(s).
./run.sh: line 5:     7 Killed                  ./parameter_server_fb
llchan commented 7 years ago

I'll take a look, but my first pass is hitting some speed bumps: no docker on my box, and gcc-4.8 means i cant use -fsanitize=leak, but I see some probably-related leaks using valgrind memcheck. Have you tried the same example using protobufs instead of flatbuffers?

aardappel commented 7 years ago

@llchan should work without Docker?

llchan commented 7 years ago

Yeah, I have it building with the Makefile + local mods, but I'm on gcc-4.8 so I have to use valgrind instead of the gcc one.

llchan commented 7 years ago

Cross referencing the original serialization traits in grpc, looks like we may be transferred ownership of the grpc_byte_buffer *buffer, rather than receiving it as a borrowed ref. This means we need to destroy it at the end of the deserialize func. Let me make some changes and see if that fixes things.

cberner commented 7 years ago

@llchan, yep my service was originally protobufs and I'm trying to port it to flatbuffers to improve throughput. The original protobufs version works fine

llchan commented 7 years ago

Was able to get docker set up on my box and could reproduce the memory leak in your example. I think the PR above fixes it, could you give it a go?

llchan commented 7 years ago

Btw, not sure how close this code is to your actual benchmark, but a few things to note:

With those two changes I see a ~2.2x increase in throughput.

cberner commented 7 years ago

Cool, thanks for the tips! Ya, that second one is just an artifact of removing a lot of code, but sizing the MessageBuilder would probably give a good speed up in my real test

aardappel commented 7 years ago

@berner it be great to hear some anecdotal numbers on what kind of speedup you're getting, if any (here's hoping the rest of gRPC does not become a bottleneck :)

@llchan now with this bug fixed, I'll do a 1.7.0 release of FlatBuffers soon, that will include this functionality, so we can get more people using this code. We may post about it.

cberner commented 7 years ago

@aardappel so far it looks like ~2x, although I'm still investigating performance bottlenecks. It seems like I should be able to get another 5x, as I'm still no where close to saturating the NIC on my machines

aardappel commented 7 years ago

@cberner That's a pretty exciting speedup already! Please keep us updated if you can.

cberner commented 7 years ago

@aardappel found the first of my issues, and have filed a ticket here: https://github.com/google/flatbuffers/issues/4354