pebbe / zmq4

A Go interface to ZeroMQ version 4
BSD 2-Clause "Simplified" License

Profiling: High memory allocations #116

Closed: omani closed this issue 6 years ago

omani commented 6 years ago

Hi,

I'm profiling my application, which uses the MDP (Majordomo) pattern, and I see a lot of memory allocations on the worker side. The test run consists of 1,000,000 (one million) messages.

According to iptraf I get a throughput of 680 Mbps and 150 kpps. But when I saw memory usage rising in htop, I wanted to investigate further. My first guess was that the broker was doing many (perhaps unnecessary) allocations, but it turned out to be the worker. After enabling profiling in Go, I can report the following results:

worker pprof:

(pprof) top 
35192815 of 39609885 total (88.85%)
Dropped 23 nodes (cum <= 198049)
Showing top 10 nodes out of 34 (cum >= 25488476)
      flat  flat%   sum%        cum   cum%
  11878750 29.99% 29.99%   11878750 29.99%  runtime.convT2E
   8618115 21.76% 51.75%   17138055 43.27%  github.com/pebbe/zmq4.(*Socket).SendBytes.func1
   5225762 13.19% 64.94%    5225762 13.19%  runtime.stringtoslicebyte
   2310190  5.83% 70.77%    2310190  5.83%  runtime.rawstringtmp
   1925187  4.86% 75.63%    4006001 10.11%  github.com/pebbe/zmq4.(*Socket).RecvBytes
   1146897  2.90% 78.53%    1146897  2.90%  runtime.convT2I
   1114129  2.81% 81.34%    2080814  5.25%  github.com/pebbe/zmq4.(*Socket).RecvBytes.func2
   1015823  2.56% 83.91%    1703961  4.30%  github.com/pebbe/zmq4.(*Socket).getInt
    983055  2.48% 86.39%    3752004  9.47%  github.com/pebbe/zmq4.(*Poller).poll
    974907  2.46% 88.85%   25488476 64.35%  github.com/omani/kimchi-worker.(*Mdwrk).Send
(pprof) 
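For readers who want to reproduce a profile like the one above, here is a minimal sketch of capturing a heap profile with the standard `runtime/pprof` package. The function name `captureHeapProfile` and the 1 KiB buffers are illustrative assumptions, not part of the original worker code.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// sink keeps the buffers reachable so they appear in the heap profile.
var sink [][]byte

// captureHeapProfile (hypothetical helper) allocates n 1 KiB buffers and
// returns the serialized heap profile as bytes.
func captureHeapProfile(n int) ([]byte, error) {
	for i := 0; i < n; i++ {
		sink = append(sink, make([]byte, 1024))
	}
	runtime.GC() // make recent allocations visible in the profile statistics

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	prof, err := captureHeapProfile(1000)
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Printf("heap profile: %d bytes\n", len(prof))
}
```

In a real worker the profile would normally be written to a file and inspected with `go tool pprof`, for example with `-sample_index=alloc_space` to look at cumulative allocations rather than live memory.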

this line:

      flat  flat%   sum%        cum   cum%
  11878750 29.99% 29.99%   11878750 29.99%  runtime.convT2E

seems to allocate more and more the longer the worker runs.

[Screenshot: 2017-10-10-145540_1280x1015_scrot] I've highlighted convT2E in purple. Notice how much of the flame graph it occupies; it plays a significant role in my critical path (one round trip of message passing).
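As background: `runtime.convT2E` was the runtime routine that older Go releases used when a concrete value is converted to an empty interface (`interface{}`), and that conversion can allocate. The sketch below measures the effect with `testing.AllocsPerRun`. The names `boxedAllocs` and `directAllocs` are illustrative, and this does not claim to show where the conversions in the profile above actually come from.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks force the stored values to escape to the heap,
// so the compiler cannot optimize the conversion away.
var (
	ifaceSink interface{}
	byteSink  []byte
)

// boxedAllocs reports the average number of heap allocations made when a
// non-nil []byte is converted to interface{}.
func boxedAllocs() float64 {
	payload := []byte("hello")
	return testing.AllocsPerRun(1000, func() {
		ifaceSink = payload // implicit conversion to interface{}
	})
}

// directAllocs reports allocations when the same slice is stored with its
// concrete type, so no interface conversion takes place.
func directAllocs() float64 {
	payload := []byte("hello")
	return testing.AllocsPerRun(1000, func() {
		byteSink = payload
	})
}

func main() {
	fmt.Println("allocs per interface conversion:", boxedAllocs())
	fmt.Println("allocs per direct assignment:   ", directAllocs())
}
```

Each conversion is small, but at 150 kpps the allocation count adds up quickly, which is consistent with it ranking high in an allocation profile without being a leak.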

relevant code in pebbe/zmq4:

/*
Send a message part on a socket.

For a description of flags, see: http://api.zeromq.org/4-1:zmq-send#toc2
*/
func (soc *Socket) SendBytes(data []byte, flags Flag) (int, error) {
    if !soc.opened {
        return 0, ErrorSocketClosed
    }
    d := data
    if len(data) == 0 {
        d = []byte{0}
    }
    size, err := C.zmq_send(soc.soc, unsafe.Pointer(&d[0]), C.size_t(len(data)), C.int(flags))
    if size < 0 {
        return int(size), errget(err)
    }
    return int(size), nil
}

If I stop the worker, the allocated memory is freed again, of course. So I guess this is not a memory leak but normal behaviour.

So the question is: how can I save memory here? My aim is to reach 1 Gbps on a moderate server (which is quite doable), but OOM issues are preventing me from doing so. Go seems to interfere here by kicking in runtime.convT2E. Or am I misinterpreting something? What could cause allocations to keep rising as more messages are sent? Is there any possible improvement to the zmq4 snippet above, or does it not play a role here? Any hints are welcome.

thanks.

pebbe commented 6 years ago

Socket.SendBytes doesn't allocate any memory, does it?

Perhaps people can help on the mailing list.

omani commented 6 years ago

Ah, totally my fault. I had set the high water mark to infinite :|

Now I have no more OOM issues.

thanks.
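To illustrate why an unlimited high water mark leads to unbounded memory growth: the high water mark caps how many messages a socket will queue, and a value of zero means no limit, so a fast sender can outrun a slow peer indefinitely. The sketch below is only an analogy built on a buffered Go channel, not ZeroMQ code. `boundedQueue`, `trySend` and `ErrQueueFull` are hypothetical names. In zmq4 the corresponding limits are the `ZMQ_SNDHWM` and `ZMQ_RCVHWM` socket options.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueFull signals that the queue has reached its limit.
var ErrQueueFull = errors.New("queue full")

// boundedQueue is a hypothetical stand-in for a socket's outgoing queue
// with a finite high water mark.
type boundedQueue struct {
	msgs chan []byte
}

// newBoundedQueue creates a queue holding at most hwm messages.
func newBoundedQueue(hwm int) *boundedQueue {
	return &boundedQueue{msgs: make(chan []byte, hwm)}
}

// trySend enqueues msg without blocking and returns ErrQueueFull once the
// limit is reached, so memory use stays bounded.
func (q *boundedQueue) trySend(msg []byte) error {
	select {
	case q.msgs <- msg:
		return nil
	default:
		return ErrQueueFull
	}
}

func main() {
	q := newBoundedQueue(3)
	accepted := 0
	for i := 0; i < 10; i++ {
		if err := q.trySend([]byte("payload")); err == nil {
			accepted++
		}
	}
	fmt.Println("accepted:", accepted, "queued:", len(q.msgs))
}
```

With a finite limit the sender gets backpressure. Depending on the socket type, ZeroMQ either blocks or drops messages once the mark is reached, instead of buffering without bound.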