jpos / jPOS

jPOS Project
http://jpos.org
GNU Affero General Public License v3.0

Silent drop of messages on OneShotChannelAdaptorMK2 #426

Open rainer010 opened 3 years ago

rainer010 commented 3 years ago

OneShotChannelAdaptorMK2 uses a single ThreadPoolExecutor backed by a SynchronousQueue, and it caps the number of threads. With this configuration, once the thread limit is reached, every subsequent message is rejected immediately and effectively discarded. See the "Queuing" section of the ThreadPoolExecutor documentation. Some of these solutions are recommended:

// Option 2
try {
    threadPool.execute(new Worker(m, i));
}catch (Exception e){
    sp.out(out, buildResponseWithError(m));
}
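
To make Option 2 easier to try outside jPOS, here is a minimal, self-contained sketch of the same idea. Everything in it is an assumption for illustration: `RejectionFallbackDemo`, `submitAll` and the `errorReplies` list are made-up names, and the list only stands in for `sp.out(out, buildResponseWithError(m))`. It saturates a pool built on a SynchronousQueue and answers the rejected message right away instead of losing it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectionFallbackDemo {

    /**
     * Submits `messages` tasks to a pool capped at `maxThreads` and returns
     * the immediate error replies produced for the rejected ones.
     */
    public static List<String> submitAll(int messages, int maxThreads)
            throws InterruptedException {
        // Direct hand-off queue plus a bounded thread count, as described above.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, maxThreads, 10, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1); // keeps every worker busy
        List<String> errorReplies = new ArrayList<>();  // hypothetical stand-in for sp.out(...)

        for (int i = 0; i < messages; i++) {
            String msg = "msg-" + i;
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate waiting for the remote response
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // Pool saturated: reply with an error now instead of dropping the message.
                errorReplies.add(msg + ":rejected");
            }
        }

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return errorReplies;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(submitAll(3, 2)); // prints [msg-2:rejected]
    }
}
```

With two threads allowed and three messages submitted, only the third one is rejected, and the caller gets its error reply without waiting for a timeout. The sketch catches RejectedExecutionException rather than Exception so that unrelated failures are not treated as saturation; that is a design choice of this example, not something taken from the adaptor.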

ar commented 3 years ago

Are you using the channel adaptor directly, without going through a MUX, which in turn uses the Space?

rainer010 commented 3 years ago

We are using a QMUX and sending messages with the QueryHost participant.

rainer010 commented 3 years ago

Let me add an example to explain this better.

With this example setup:

<channel-adaptor name="jpts-channel" class="org.jpos.iso.OneShotChannelAdaptorMK2"
                 logger="Q2">
    ....
    <max-connections>128</max-connections>
    <max-connect-attempts>20</max-connect-attempts>
</channel-adaptor>

If there are already 128 active connections (threads) waiting for a response and another message is pending in its queue, OneShotChannelAdaptorMK2 still tries to hand it to a new worker. That call throws an exception immediately, because the pool uses a SynchronousQueue and no thread is available.

...
    public void run(){
        while (running()){
            try{
                Object o = sp.in(in, delay);
                if (o instanceof ISOMsg){
                    ...
                    threadPool.execute(new Worker(m, i));
                }
            }catch (Exception e){
                getLog().warn(getName(), e.getMessage());
            }
    ...

Generally the channel log is disabled in production, so we never notice this situation. QueryHost does not notice it either: it only fails once its timeout expires.
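
The behaviour can be reproduced without jPOS. The sketch below is only an approximation under stated assumptions: `SilentDropDemo`, `unanswered` and the `channelLogEnabled` flag are hypothetical names, and a plain set of answered message ids stands in for the response side of the Space. It mirrors the catch-and-warn loop shown in the excerpt and reports which messages never receive a reply:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SilentDropDemo {

    /** Returns the ids of the messages that never got a reply. */
    public static List<String> unanswered(int messages, int maxThreads, long timeoutMillis)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, maxThreads, 10, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);        // holds workers busy
        Set<String> answered = ConcurrentHashMap.newKeySet();  // stand-in for replies
        boolean channelLogEnabled = false;                     // as in production

        for (int i = 0; i < messages; i++) {
            String msg = "msg-" + i;
            try {
                pool.execute(() -> {
                    try {
                        release.await();   // simulate the remote round trip
                        answered.add(msg); // the reply reaches the requester
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (Exception e) {
                // Same pattern as the excerpt: the rejection is only logged,
                // and with the log disabled it leaves no trace at all.
                if (channelLogEnabled) {
                    System.err.println(e.getMessage());
                }
            }
        }

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);

        List<String> missing = new ArrayList<>();
        for (int i = 0; i < messages; i++) {
            if (!answered.contains("msg-" + i)) {
                missing.add("msg-" + i);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(unanswered(3, 2, 5000)); // prints [msg-2]
    }
}
```

The third message is rejected while both threads are busy, nothing is written anywhere, and its requester would simply wait until its own timeout expires.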

ar commented 3 years ago

This problem doesn't happen in the OneShotChannelAdaptor; I suggest you use that one instead of the contributed MK2 variant.