OpenSmpp / opensmpp

OpenSmpp - Java library implementing the SMPP protocol, and allowing development of External Short Message Entities (ESMEs) and more.
http://opensmpp.org/
Other
188 stars 170 forks source link

Session closing and SMSCListener stopping #33

Closed thehcma closed 6 years ago

thehcma commented 6 years ago

When stopping a Session or stopping an SMSCListener, an ongoing socket operation (reading, in the first case and accepting, in the second case) can delay the termination for one minute (since the socket timeout is fixed at 60s).

By closing the Connection object (connection) on Session.unbind() and (serverConn) on SMSCListener.stop(), the socket operation will unblock and they will terminate.

This will prevent the unnecessary wait on either end of the connection.

paoloc0 commented 6 years ago

Hi. At a glance, Session.unbind() seems to not hang around waiting unnecessarily, for the case of a clean bind shutdown (i.e. when an unbind response is received). So I presume you're saying something like "I want the session to be aggressively shut down, and not wait for an unbind response (along with any other PDUs that might still be outstanding)".

Is that correct? If so: this sounds like something that would have to be an optional feature, because not everyone will want an aggressive, unclean shutdown.

thehcma commented 6 years ago

With some test code, this is not what I observe - I am not forcing an unclean unbind, it should be business as usual.

Obviously, it's possible I am doing something wrong, but below is the log of a very simple test that shows the problem.

I have a client sending an unbind request, which the server responds to almost immediately (see the logs below), yet a minute goes by until the unbind call returns:

19:20:53 =================== before unbind
19:20:53 [mylogger] server response: (submit_resp: (pdu: 0 80000004 0 3) Smsc2001 ) 
19:20:53 [mylogger] client response: (deliver_resp: (pdu: 17 80000005 0 4)  ) 
19:20:53 [mylogger] client request: (unbind: (pdu: 16 6 0 5) ) 
19:20:53 [mylogger] server response: (unbind_resp: (pdu: 0 80000006 0 5) ) 
19:21:53 =================== after unbind(unbind_resp: (pdu: 16 80000006 0 5) ) 

Looking at what the process is doing, I see the following thread stacks:

This one is stuck on a socket read call (with a 60s timeout):

"Receiver-1" #19 prio=5 os_prio=0 tid=0x00007fdfb4001000 nid=0x657a runnable [0x00007fe008839000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x00000000ec54ad30> (a java.io.BufferedInputStream)
    at org.smpp.TCPIPConnection.receive(TCPIPConnection.java:408)
    at org.smpp.ReceiverBase.receivePDUFromConnection(ReceiverBase.java:197)
    at org.smpp.Receiver.receiveAsync(Receiver.java:358)
    at org.smpp.ReceiverBase.process(ReceiverBase.java:96)
    at org.smpp.util.ProcessingThread.run(ProcessingThread.java:199)
    at java.lang.Thread.run(Thread.java:748)

and the thread executing the unbind is wait for the socket read above to terminate:

"main" #1 prio=5 os_prio=0 tid=0x00007fe03400a800 nid=0x6556 waiting on condition [0x00007fe03d08b000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.smpp.util.ProcessingThread.stop(ProcessingThread.java:167)
    at org.smpp.Receiver.stop(Receiver.java:219)
    at org.smpp.Session.unbind(Session.java:506)

Hence, the timeout of 60s (which as far as I can tell cannot be programmatically changed) holds the unbind causing an unnecessary delay.

Interestingly, I see a similar behavior on the server side, but the invocation of a stop in the listener is held up by a socket accept call.

paoloc0 commented 6 years ago

Hmm, I tested it a few times with org.smpp.test.SMPPTest, and found that it intermittently (but more often that not) locks up for around 60s on unbind. Ok, will look further when I get a chance.

paoloc0 commented 6 years ago

I've added some tests in ClientSessionUnbindIT that demonstrate the effect of various settings on slow shutdown of both client and server sessions. There are various code paths and possibilities (synchronous, asynchronous), but for the most common usage, I believe that it comes down to:

There are more elaborate solutions (changing the socket read loops to not block in the current way; replacing the socket implementation with one that can be interrupted closing the underlying sockets as you proposed; re-architecting things more dramatically), but I don't think we're going to go that way in a mature library, given the limited benefits.

paoloc0 commented 6 years ago

I decided to decrease the defaults for COMMS_TIMEOUT and QUEUE_TIMEOUT. Hopefully this resolves the issue.

thehcma commented 6 years ago

Thanks @paoloc0 - I still like my solution better (I hacked it using reflection) :), but I guess this will do. Do you intend to bump the version?

paoloc0 commented 6 years ago

Fair enough :)

I've released v3.0.2.