ilyaevseev / nng-deadlock

Demonstrate deadlock in nng (nanomsg next generation) library.

Need to sleep before close #1

Open gdamore opened 6 years ago

gdamore commented 6 years ago

Calling close() can cause messages to be lost before they actually get delivered, as close() doesn't "flush" the queue all the way to the peer. This is an attribute of the underlying transports.

It is therefore recommended to sleep briefly before calling close(). I typically inject a 100 ms sleep here. (You can skip the sleep if you haven't sent traffic in a while, or shorten it based on how long ago you last sent traffic. 100 ms is a deliberately generous estimate: long enough to drain both the software queue in the library and the in-kernel queue in the IPC buffers.)
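The idea of shortening the sleep based on how recently traffic was sent can be sketched as a small helper. This is only an illustration: `drain_delay_ms` is a hypothetical name, not part of nng, and the 100 ms budget is the rough estimate from the comment above, not a guaranteed bound.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of nng): given the current time, the time
 * of the last send, and a drain budget, all in milliseconds, return how
 * long to sleep before closing so that the queues have had the full
 * budget to drain. Returns 0 if the last send is already old enough.
 */
static uint64_t
drain_delay_ms(uint64_t now_ms, uint64_t last_send_ms, uint64_t budget_ms)
{
        uint64_t elapsed;

        if (now_ms < last_send_ms) {
                /* Inconsistent timestamps: be conservative, wait it all. */
                return budget_ms;
        }
        elapsed = now_ms - last_send_ms;
        return (elapsed >= budget_ms) ? 0 : budget_ms - elapsed;
}
```

In an nng program one would presumably record a millisecond timestamp after each successful send (for example with `nng_clock()`, assuming it is available in the nng version in use), then pass the helper's result to `nng_msleep()` just before `nng_close()`. A send 50 ms ago with a 100 ms budget yields a 50 ms sleep; a send 100 ms or more ago yields no sleep at all.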

gdamore commented 6 years ago

Rather than send a PR, here's a diff:

index 49642e6..279d160 100644
--- a/nng-deadlock.c
+++ b/nng-deadlock.c
@@ -10,6 +10,7 @@
 #include <nng/protocol/reqrep0/rep.h>
 #include <nng/protocol/pair0/pair.h>
 #include <nng/protocol/pair1/pair.h>
+#include <nng/supplemental/util/platform.h>

 #include "mystuff.c"

@@ -70,10 +71,12 @@ void run_server(void)
                nng_free(recvbuf, recvsz);
                printf("server pass %d: free done.\n", pass);

+               nng_msleep(100);
                nng_call("server:xxx:close",  nng_close(loopsock));
                printf("server finished pass %d\n", pass);
        }

+       nng_msleep(100);
        nng_call("server:ctl:close",  nng_close(ctlsock));
 }

ilyaevseev commented 6 years ago

OK, but can you please take a look at https://github.com/nanomsg/nng/issues/570#issuecomment-402978790? The problem happens even with the TCP transport.

gdamore commented 6 years ago

Yes. It can happen with any transport. The root cause is the same and the fix is the same.

On Fri, Jul 6, 2018, 4:10 AM Ilya Evseev notifications@github.com wrote:

OK, but can you please take a look at nanomsg/nng#570 (comment) https://github.com/nanomsg/nng/issues/570#issuecomment-402978790? The problem happens even with the TCP transport.
