Closed: @negativ closed this issue 9 years ago.
@negativ thanks for letting me know about this and for the working code!
I suspect if there's a leak, it will be in inert. I'll run some stress tests. Is your client doing many connect/send/disconnects or staying connected?
With procket, it's possible to leak fds. You might want to check:
lsof -p <pidofbeam>
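A rough way to watch for this from inside the Erlang shell (a sketch, Linux only: it counts the entries in /proc/&lt;pid&gt;/fd, so the absolute number depends on the system; what matters is whether it climbs steadily between calls):

```erlang
%% Count this VM's open descriptors. Run it periodically; a steadily
%% climbing count points at leaked sockets.
1> length(element(2, file:list_dir("/proc/" ++ os:getpid() ++ "/fd"))).
```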
In your example, inert needs to be started before being used. Otherwise, inert will return {error,closed}, the process will crash and the fd will leak.
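A minimal sketch of the defensive pattern (poll_or_close/1 is a hypothetical helper, not part of inert's API):

```erlang
%% If inert is not started, inert:poll/1 returns {error, closed} and a
%% bare {ok, read} = inert:poll(Fd) match crashes the process, orphaning
%% the descriptor. Close the fd before exiting so it is not leaked.
poll_or_close(Fd) ->
    case inert:poll(Fd) of
        {ok, read} -> ok;
        Error ->
            procket:close(Fd),
            erlang:exit({poll, Error})
    end.
```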
inert can also be used to block on the listening socket before the accept.
Here are some quick changes, I'm echoing back the data for testing:
--- hawk_uds.erl.orig 2015-07-22 15:53:29.216640062 -0400
+++ hawk_uds.erl 2015-07-22 15:53:59.206640055 -0400
@@ -12,4 +12,5 @@
init() ->
+ inert:start(),
file:delete(?UDS_PATH),
{ok, Fd} = procket:socket(unix, stream, 0),
@@ -22,10 +23,17 @@
accept(Fd) ->
+ case inert:poll(Fd) of
+ {ok,read} -> ok;
+ Error ->
+ procket:close(Fd),
+ erlang:exit({accept, Error})
+ end,
case procket:accept(Fd) of
{error, eagain} ->
- timer:sleep(50); %% is there some scheduler-friendly accept/1 ?
+ ok;
{ok, ClientFd} ->
spawn_opt(?MODULE, handle, [ClientFd], [{fullsweep_after, 0}])
end,
+ error_logger:info_report({accept, 3}),
accept(Fd).
@@ -35,8 +43,9 @@
{ok, Binary} = read_packet(Sock, PackLen),
- {Func, Args} = _Req = binary_to_term(Binary),
+% {Func, Args} = _Req = binary_to_term(Binary),
- Ret = erlang:apply(hawk, Func, Args),
- BinRet = term_to_binary(Ret),
+% Ret = erlang:apply(hawk, Func, Args),
+% BinRet = term_to_binary(Ret),
+ BinRet = Binary,
BinSize = byte_size(BinRet),
@@ -52,5 +61,10 @@
read_packet_len(Fd) ->
- {ok, <<Len:32/big-unsigned-integer>>} = read_packet(Fd, 4, <<>>), {ok, Len}.
+ case read_packet(Fd, 4, <<>>) of
+ {ok, <<Len:32/big-unsigned-integer>>} -> {ok, Len};
+ _Error ->
+ procket:close(Fd),
+ erlang:exit({reason, read_packet_len})
+ end.
@@ -64,6 +78,11 @@
case procket:read(Fd, Len) of
{error, eagain} ->
- {ok, read} = inert:poll(Fd),
- read_packet(Fd, Len, Bin);
+ case inert:poll(Fd) of
+ {ok, read} ->
+ read_packet(Fd, Len, Bin);
+ _ ->
+ procket:close(Fd),
+ ok
+ end;
{ok, <<>>} ->
procket:close(Fd),
@@ -72,4 +91,7 @@
{ok, Data};
{ok, Data} ->
- read_packet(Fd, Len - byte_size(Data), <<Bin/binary, Data/binary>>)
+ read_packet(Fd, Len - byte_size(Data), <<Bin/binary, Data/binary>>);
+ Error ->
+ error_logger:info_report({read_packet, Error}),
+ procket:close(Fd)
end.
inert is started by the top-level supervisor. This code is part of a huge project, so I've just posted an example of the code that causes the memory leak. The client connects to the server and stays connected, without reconnects, for ~12 hours. There are no fd leaks: the code is well tested and it's OK.
At first I thought that my code was leaking refc binaries, so I did some crazy work rewriting the server logic. =) Finally, I decided to use gen_tcp in the same way as the inert + procket combination, and all the problems just went away. =)
So far I haven't been able to reproduce this. I'm testing using 2 echo servers:
https://gist.github.com/msantos/ba8c9e443da4058ac830
https://gist.github.com/msantos/fb0accb7ce0d2e657f86
https://gist.github.com/msantos/c9b6c459e10f44ab90c2
I started the echo servers and 2 clients:
% Erlang VM 1: gen_tcp
1> xecho:listen(9999).
% Erlang VM 2: procket/inert
1> iecho:listen(8888).
% Erlang VM 3: port 9999, 10 clients, 1 ms between requests, run forever
1> xt:start(9999, 10, 1, -1).
% Erlang VM 4: port 8888, 10 clients, 1 ms delay, run forever
1> xt:start(8888, 10, 1, -1).
After connecting, the client will send 100 bytes of data, wait for the response and sleep for 1 ms in a loop.
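The actual client is in the xt gist above; as a rough sketch (names and module layout here are assumptions), such a loop might look like:

```erlang
%% Hypothetical echo client: connect once, then send a 100-byte payload,
%% wait for the echoed reply and sleep 1 ms, forever.
client(Port) ->
    {ok, Sock} = gen_tcp:connect("localhost", Port,
                                 [binary, {active, false}]),
    loop(Sock, binary:copy(<<"x">>, 100)).

loop(Sock, Payload) ->
    ok = gen_tcp:send(Sock, Payload),
    {ok, _Echo} = gen_tcp:recv(Sock, byte_size(Payload)),
    timer:sleep(1),
    loop(Sock, Payload).
```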
I used pidstat to get a general idea of the size of the VMs. The result after running for a few hours:
# 23927 = xecho: gen_tcp
# 23983 = iecho: procket/inert
$ pidstat -I 60 -ru -p 23927 -p 23983
Average: UID PID %usr %system %guest %CPU CPU Command
Average: 1000 23927 17.40 12.39 0.00 7.97 - beam.smp
Average: 1000 23983 13.74 13.31 0.00 7.24 - beam.smp
Average: UID PID minflt/s majflt/s VSZ RSS %MEM Command
Average: 1000 23927 0.05 0.00 778960 35336 0.22 beam.smp
Average: 1000 23983 0.00 0.00 784128 30032 0.18 beam.smp
I am going to run a few more tests but let me know if you can think of anything I should try.
Have you tried using recon to see where the memory is going?
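For example, assuming recon is in the code path:

```erlang
%% Top 5 processes by memory usage.
1> recon:proc_count(memory, 5).
%% Processes that released the most refc binaries after a forced GC;
%% useful when a binary leak is suspected.
2> recon:bin_leak(5).
```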
The same test, with hawk_uds.erl modified to echo the packets back over a unix socket:
# Erlang VM 1
1> hawk_uds:start_link("/tmp/t.s").
# Erlang VM 2: 50 clients, 1 ms delay between sends, run forever
1> xu:start("/tmp/t.s", 50, 1, -1).
Memory usage is stable with both the TCP and unix socket servers.
About your code, my guess is a process mailbox is blowing up, probably from an unhandled error message. Use sys:get_status/1 or recon to see what is going on.
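A quick way to check for a growing mailbox (a sketch; Pid stands for the suspect process):

```erlang
%% Number of messages queued in the process mailbox.
1> erlang:process_info(Pid, message_queue_len).
%% Full status of an OTP-compliant process (gen_server etc.).
2> sys:get_status(Pid).
%% Or, with recon: top 5 processes by mailbox size across the node.
3> recon:proc_count(message_queue_len, 5).
```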
The problem is not a growing mailbox, because I tested a version which spawns a new process on every iteration of handle/1. I'll try to reproduce the problem with your test modules today.
OK, I found that the problem was leaking procket fds. At some point the supervisor stops inert, and all clients that use procket + inert go to hell. =)
Sorry for the incorrect report.
This Erlang module accepts connections on a unix-domain socket and handles every new client in an infinite loop (handle/1) in a new process. An incoming message is an Erlang term {Function, Args} in ETF, prefixed with 4 bytes indicating the payload size. Each client generates ~50 rps (each request 50-100 bytes long).
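The framing described above can be sketched as (hypothetical encode/decode helpers, not from the module itself):

```erlang
%% 4-byte big-endian length prefix followed by the request in external
%% term format (ETF).
encode({_Func, _Args} = Req) ->
    Bin = term_to_binary(Req),
    <<(byte_size(Bin)):32/big-unsigned-integer, Bin/binary>>.

decode(<<Len:32/big-unsigned-integer, Payload:Len/binary>>) ->
    binary_to_term(Payload).
```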
When I run the server with 4-5 clients, its memory grows steadily (~2.5-3 MB per hour) and is never freed. When I replace all the procket code with gen_tcp, everything is fine and there are no memory leaks.
Is there a logic error in my code? Or a memory leak in procket?