quickfix-j / quickfixj

QuickFIX/J is a full featured messaging engine for the FIX protocol. - This is the official project repository.
http://www.quickfixj.org

Fail over retry attempt #250

Open MuhammadhAadhil opened 4 years ago

MuhammadhAadhil commented 4 years ago

Hi All,

I'm new to this community and this is my first comment. We use the QuickFIX/J library in our application. We have received a requirement to handle the fail-over scenario for an initiator session as follows.

  1. If the connection to the primary host (SocketConnectHost) is lost, we should retry the same connection a configurable number of times.

  2. Only if all of those retry attempts fail should we start trying the other hosts (SocketConnectHost1, SocketConnectHost2, ..., SocketConnectHost[N]).

As far as I understand, this behaviour is not available in the existing code, but it seems doable by introducing a new parameter to the class "IoSessionInitiator". Please advise me further on this.

As I mentioned, I'm new to this community and I'm not sure whether this is the right place to discuss this. If it is not, could someone please point me in the right direction? Thanks in advance.

chrjohn commented 4 years ago

Maybe you can outline the changes that you want to introduce into the IoSessionInitiator? IMHO it would be good to introduce some kind of strategy that can easily be changed without changing QFJ itself.
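To make the idea of a swappable strategy concrete, here is a minimal sketch of the behaviour requested above: retry the current address a configurable number of times, then move on to the next one. The class `RetryThenFailoverSelector` and its methods are hypothetical names invented for this illustration. They are not part of QuickFIX/J and do not reflect how IoSessionInitiator is actually implemented.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a fail-over strategy; not a QuickFIX/J class. */
final class RetryThenFailoverSelector {
    private final List<InetSocketAddress> addresses;
    private final int maxAttemptsPerAddress;
    private int index = 0;          // position of the address currently in use
    private int failedAttempts = 0; // consecutive failures on that address

    RetryThenFailoverSelector(List<InetSocketAddress> addresses, int maxAttemptsPerAddress) {
        if (addresses.isEmpty() || maxAttemptsPerAddress < 1) {
            throw new IllegalArgumentException("need at least one address and one attempt");
        }
        this.addresses = new ArrayList<>(addresses);
        this.maxAttemptsPerAddress = maxAttemptsPerAddress;
    }

    /** Address that the next connection attempt should use. */
    InetSocketAddress current() {
        return addresses.get(index);
    }

    /** Record a failed attempt; advance (wrapping around) once the retry budget is spent. */
    void onConnectFailure() {
        failedAttempts++;
        if (failedAttempts >= maxAttemptsPerAddress) {
            failedAttempts = 0;
            index = (index + 1) % addresses.size();
        }
    }

    /** Record a successful connection; the retry budget for this address is restored. */
    void onConnectSuccess() {
        failedAttempts = 0;
    }
}
```

With two addresses and `maxAttemptsPerAddress` set to 3, `current()` returns the primary until the third consecutive failure and the backup afterwards. Whether a successful connection should also switch back to the primary is a design choice that this sketch leaves open; here the selector simply stays on the address that worked.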

MuhammadhAadhil commented 4 years ago

Hi @chrjohn, thanks a lot for the response. I've made the change and tested it, and it works as expected. As you mentioned, the main change is in the class IoSessionInitiator, but a couple of other classes also need to be modified in order to introduce the new configuration. I've attached those classes here. I would be grateful if you could review my changes. I made this change on top of version 2.0.0. If it is acceptable, please guide me on how to add it to the latest code. ChangedClasses.zip Configuration&TestEvidence.zip

philipwhiuk commented 4 years ago

To submit a change so we can modify/merge it:

PetteriPertola commented 3 years ago

Hi, was this ever added as a PR / merged into a release version by any chance? I tried to look but I couldn't find anything that suggested it had. Thanks!

chrjohn commented 3 years ago

Hi @PetteriPertola, no, this has not been merged since no PR was submitted.

PetteriPertola commented 3 years ago

Thanks. We're seeing a similar issue: if the primary host is down at startup, the failover to SocketConnectHost1/SocketConnectPort1 does not work; the initiator just keeps retrying SocketConnectHost over and over again.

chrjohn commented 3 years ago

@PetteriPertola Maybe you can take a stab at a PR?

suguiura commented 2 years ago

Hi guys,

I found a funny thing: the failover feature works sometimes, but not always.

For the failover test I created a working acceptor on port 9998, and I also created a TCP server on port 9999 that sends a reset back to the initiator as soon as it connects:

final ServerSocket serverSocket = new ServerSocket(9999);

while (true) {
    try (final Socket socket = serverSocket.accept()) {
        socket.setSoLinger(true, 0); // linger timeout 0: close() sends a TCP RST instead of a FIN
        System.out.println("accepted " + socket);
    } finally {
        System.out.println("done");
    }
}

The TCP server above makes the initiator try the next host:port. However, if we add a Thread.sleep(20) before closing the connection, the failover stops working.
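For anyone who wants to reproduce both cases, the following is a minimal self-contained sketch of the same server with the delay made configurable. The class name `DelayedResetServer` and the method `acceptAndReset` are made up for this example; only the accept, SO_LINGER and sleep steps come from the description above.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Illustrative helper; the names are assumptions, not part of QuickFIX/J or MINA. */
final class DelayedResetServer {

    /**
     * Accepts a single connection, waits delayMillis, then closes it.
     * With SO_LINGER enabled and a timeout of 0, close() resets the connection.
     */
    static void acceptAndReset(ServerSocket serverSocket, long delayMillis)
            throws IOException, InterruptedException {
        try (Socket socket = serverSocket.accept()) {
            socket.setSoLinger(true, 0);
            System.out.println("accepted " + socket);
            Thread.sleep(delayMillis); // 0 reproduces the immediate reset, 20 the delayed one
        } finally {
            System.out.println("done");
        }
    }
}
```

Calling `acceptAndReset(new ServerSocket(9999), 0)` in a loop should behave like the original server, while a delay of 20 corresponds to the variant where failover reportedly stops working.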

The divergence arises because the call to Net.pollConnectNow(fd) in the .finishConnect() method of sun.nio.ch.SocketChannelImpl happens sooner or later relative to the response from the server.
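For readers unfamiliar with where finishConnect() sits in a non-blocking connect, here is a small JDK-only sketch of the sequence. It does not reproduce the timing problem and it is not MINA's code; the class `NonBlockingConnectProbe` and the method `tryConnect` are hypothetical names used only to show the step at which a connection failure becomes visible to the caller.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

/** Illustrative only; the names are assumptions and this is not MINA's implementation. */
final class NonBlockingConnectProbe {

    /**
     * Attempts a non-blocking connect. Returns true if the connection was
     * established, false on failure or timeout. timeoutMillis must be positive.
     */
    static boolean tryConnect(InetSocketAddress address, long timeoutMillis) {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            if (channel.connect(address)) {
                return true; // connected immediately, which can happen on loopback
            }
            channel.register(selector, SelectionKey.OP_CONNECT);
            if (selector.select(timeoutMillis) == 0) {
                return false; // no connect event within the timeout
            }
            // A pending connect failure is reported here as an IOException.
            return channel.finishConnect();
        } catch (IOException e) {
            return false;
        }
    }
}
```

Connecting to a port with no listener should return false, and connecting to a listening port should return true. The case discussed in this thread sits in between: the connect succeeds, and what differs is whether the reset has already arrived by the time finishConnect() runs.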

A workaround for this case is to add a Thread.sleep(MILLIS) (with a reasonable MILLIS value) at the beginning of the .finishConnect(handle) method of org.apache.mina.transport.socket.nio.NioSocketConnector, which comes from the org.apache.mina:mina-core:2.1.4 dependency.

chrjohn commented 2 years ago

@suguiura Thanks for the comment, but if you are suggesting a change for MINA then your best bet is to open an issue in their issue tracker: http://issues.apache.org/jira/browse/DIRMINA

MuhammadhAadhil commented 1 year ago

Hi All, I apologize for my late response. I've added the change. Please review it: https://github.com/quickfix-j/quickfixj/pull/624