twitter-archive / cloudhopper-smpp

Efficient, scalable, and flexible Java implementation of the Short Messaging Peer to Peer Protocol (SMPP)

Threads get stuck at DefaultSmppClient.createConnectedChannel #117

Open elruwen opened 8 years ago

elruwen commented 8 years ago

Aloha!

We occasionally have threads that get stuck. All the thread dumps show the following:

"Sender Heartbeat 1" prio=10 tid=0x00007f0b50feb800 nid=0x7fa in Object.wait() [0x00007f0b849e2000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000caa999b0> (a org.jboss.netty.channel.DefaultChannelFuture)
    at java.lang.Object.wait(Object.java:503)
    at org.jboss.netty.channel.DefaultChannelFuture.awaitUninterruptibly(DefaultChannelFuture.java:259)
    - locked <0x00000000caa999b0> (a org.jboss.netty.channel.DefaultChannelFuture)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.createConnectedChannel(DefaultSmppClient.java:286)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.doOpen(DefaultSmppClient.java:224)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.bind(DefaultSmppClient.java:193)

When our connection drops, we unbind the SmppSession (if isBound()). Then we call client.bind() again. The issue occurs only sometimes, and I haven't been able to reproduce it yet. It happens with different providers.

Maybe it makes sense, as a workaround, not to wait forever in com.cloudhopper.smpp.impl.DefaultSmppClient#createConnectedChannel:

        // attempt to connect to the remote system
        ChannelFuture connectFuture = this.clientBootstrap.connect(socketAddr);

        // wait until the connection is made successfully
        // boolean timeout = !connectFuture.await(connectTimeoutMillis);
        // BAD: using .await(timeout)
        //      see http://netty.io/3.9/api/org/jboss/netty/channel/ChannelFuture.html
        connectFuture.awaitUninterruptibly();
        //assert connectFuture.isDone();
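
To illustrate the kind of bounded wait I have in mind, here is a minimal sketch. It is not cloudhopper or Netty code: it uses only java.util.concurrent, with a CompletableFuture standing in for the connect future, and the class and method names (BoundedConnectWait, awaitConnect) are made up for this example. In Netty 3 itself, ChannelFuture also has awaitUninterruptibly(long timeoutMillis), which returns false if the future did not complete in time, so something similar might be possible there.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of cloudhopper-smpp or Netty.
final class BoundedConnectWait {

    // Waits at most timeoutMillis for the pending connect to complete.
    // If the wait times out, the pending attempt is cancelled and the
    // TimeoutException is rethrown, so the calling thread is released
    // instead of blocking forever.
    static <T> T awaitConnect(CompletableFuture<T> connectFuture, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        try {
            return connectFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on this attempt; the caller can decide to retry bind().
            connectFuture.cancel(true);
            throw e;
        }
    }
}
```

With something like this, the heartbeat thread would get an exception after the timeout and could retry, rather than sitting in Object.wait() indefinitely.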

I am using version 5.0.8.

Cheers, Ruwen

elruwen commented 8 years ago

I wanted to move this bug report to the new repo, but I can't raise issues there. Can anyone fix it?