Closed Oleg-Afanasiev closed 7 years ago
Thanks @Oleg-Afanasiev. @vetss @nhanth87 can you review ?
Hello @Oleg-Afanasiev
1) I have merged your update https://github.com/RestComm/jss7/pull/146 into the master branch
2)
The server node goes down unexpectedly, without exchanging shutdown messages. The client side learns about the lost association only after heartBeatTimeInterval. But even after that, load sharing in the m3ua part (AsImpl -> method "write") continues to send messages to an ASP that is active but whose association is inactive. So we have to check association.isConnected() too.
Can you please comment on this? After the SCTP connection drops, the m3ua stack must move the ASP to a state other than ACTIVE (after the AssociationListener.onCommunicationShutdown() event) and prevent further message sending. But you are reporting that messages are being sent to an inactive ASP even after heartBeatTimeInterval. How can it be reproduced? Or maybe you have an idea why we see such behavior.
We see this behavior only in a concurrent environment. After the SCTP connection drops, the SCTP level marks the connection as closed and the m3ua level changes the ASP state a little later. So there is a very small time window in which association.isConnected() == false but AspState.getState(aspFsm.getState().getName()) == AspState.ACTIVE, and messages sent from other threads during that window can be lost. I tested the client by sending requests from several hundred threads while periodically switching off one of the ASP nodes. Rarely (about every fifth switch-off in my case) I lost just one message for this reason. So I think it isn't a critical problem, but it would be useful to check the association too.
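The extra check discussed here can be sketched as follows. This is a minimal, self-contained illustration and not the actual AsImpl code: `Link`, `AspSelector` and `selectUsable` are hypothetical names, and the two boolean flags merely stand in for the ASP state and for association.isConnected(). Only the idea of testing both conditions before sending comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for one ASP together with its SCTP association.
final class Link {
    final String name;
    volatile boolean aspActive;  // what the M3UA layer believes (ASP state is ACTIVE)
    volatile boolean connected;  // what the SCTP layer reports (association is connected)

    Link(String name, boolean aspActive, boolean connected) {
        this.name = name;
        this.aspActive = aspActive;
        this.connected = connected;
    }

    // A link is usable only if BOTH layers agree that it is up.
    boolean usable() {
        return aspActive && connected;
    }
}

// Hypothetical round-robin load sharer that skips links failing either check.
final class AspSelector {
    private final List<Link> links;
    private final AtomicInteger next = new AtomicInteger();

    AspSelector(List<Link> links) {
        this.links = new ArrayList<>(links);
    }

    Optional<Link> selectUsable() {
        int n = links.size();
        if (n == 0) {
            return Optional.empty();
        }
        int start = Math.floorMod(next.getAndIncrement(), n);
        for (int i = 0; i < n; i++) {
            Link candidate = links.get((start + i) % n);
            if (candidate.usable()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // nothing usable: caller must queue or reject
    }
}
```

Checking both flags narrows the window but does not close it completely: the association can still drop between the check and the actual write, so a message may occasionally be lost anyway.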
Hello @Oleg-Afanasiev
I see your point. When a connection is lost, some traffic may also be lost at the transport level.
I see your closed PR: https://github.com/RestComm/jss7/pull/164 Do you still need this update? If yes, please provide a PR against the master branch.
No, now I don't need it. Thank you for discussing this problem.
Hello @Oleg-Afanasiev
ok, I am closing this issue.
Type: Enhancement
Description: Hello. I have two problems with lost messages when using the MAP stack. I send checkImeiRequests (several thousand per second) using MAPDialogMobility.addCheckImeiRequest_Huawei(...) on the "client" side and receive responses from the "server" side over an SCTP channel. I am using the following configuration: one client and two (or more) servers, with one "AS" and two (or more) "ASPs", as shown in the diagram below. The client should get a response from the server within a timeout interval.
.........................................../--->ASP1-->Association_1-->|Server_1|
......................................../
Client -->request-->AS-->|-M3UA-(LS)-|->
........................................\
...........................................\--->ASP2-->Association_2-->|Server_2|
It is possible that one of the server nodes goes down.
1) The server node is shutting down and initiates closing of the association. In this case, some time later (delta T) the client receives the service messages from the server ("SHUTDOWN, SHUTDOWN SACK, ...") and all subsequent requests are sent only through the active ASP. That is OK. But I may lose several requests that were sent after the server had actually shut down and before the service messages arrived, that is, within the "delta T" interval; these requests never get a response. I must not lose those requests, because there is still an active ASP and they could be handled by the live server node. How could I fix this manually? In the TCAP part I keep each sent packet until its response arrives. Each sent and received packet can be identified by its transaction id. When a shutdown happens, all unanswered requests that were sent to the dead server have to be resent. But I need to know exactly which ASP/association each request was sent on. Request packets are distributed to a concrete ASP in the m3ua part: org.mobicents.protocols.ss7.m3ua.impl.AsImpl -> method "protected void write(PayloadData message){...}". So at the very least I need a listener that notifies me which concrete ASP a message was sent on. Maybe there is a more graceful solution.
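The bookkeeping described in this point could look roughly like the sketch below. It assumes that some hook reports which ASP each message was written to, which is exactly the listener this issue asks for and which does not exist in the stack as far as the thread shows. `InFlightTracker` and its methods are hypothetical names, and the transaction id is simplified to a `long`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: remembers which ASP each unanswered request went to.
final class InFlightTracker {
    // transaction id -> name of the ASP the request was written to
    private final Map<Long, String> aspByTransaction = new ConcurrentHashMap<>();

    // Call when a request has been handed to a concrete ASP.
    void onSent(long transactionId, String aspName) {
        aspByTransaction.put(transactionId, aspName);
    }

    // Call when the response for a transaction arrives.
    void onResponse(long transactionId) {
        aspByTransaction.remove(transactionId);
    }

    // Call when an ASP/association is reported down.
    // Returns the ids of unanswered requests that went to that ASP and must be resent.
    List<Long> onAspDown(String aspName) {
        List<Long> toResend = new ArrayList<>();
        for (Map.Entry<Long, String> e : aspByTransaction.entrySet()) {
            // remove(key, value) succeeds only if the mapping is still unchanged,
            // so a response arriving concurrently is not resent by mistake.
            if (e.getValue().equals(aspName)
                    && aspByTransaction.remove(e.getKey(), e.getValue())) {
                toResend.add(e.getKey());
            }
        }
        return toResend;
    }

    int pending() {
        return aspByTransaction.size();
    }
}
```

The caller would resend each returned transaction through the remaining active ASP and register it again with `onSent`. Note that a request the dead server already processed may then be handled twice, so this only suits requests that are safe to repeat.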
2) The server node goes down unexpectedly, without exchanging shutdown messages. The client side learns about the lost association only after heartBeatTimeInterval. But even after that, load sharing in the m3ua part (AsImpl -> method "write") continues to send messages to an ASP that is active but whose association is inactive. So we have to check association.isConnected() too.