Closed mjwood7 closed 12 years ago
Suggestion - Perhaps un-nesting the if/else block.
public bool Read()
{
try
{
if (socket_.Poll(1000000, SelectMode.SelectRead)) // one-second timeout
{
int bytesRead = socket_.Receive(readBuffer_);
if (0 == bytesRead)
throw new SocketException(System.Convert.ToInt32(SocketError.ConnectionReset));
parser_.AddToStream(System.Text.Encoding.UTF8.GetString(readBuffer_, 0, bytesRead));
}
**else** if (null != session_)
{
session_.Next();
}
else
{
throw new QuickFIXException("Initiator timed out while reading socket");
}
ProcessStream();
return true;
}
catch ...
}
I don't think this is necessarily a bug. The fix protocol handles both sides of the heartbeat transaction. Take a look at this:
http://fixwiki.fixprotocol.org/fixwiki/Heartbeat
Specifically the part about when a side has not received a heartbeat in the heartbeat interval.
When either end of the connection has not received any data for (HeartBtInt + "some reasonable transmission time") seconds, it will transmit a Test Request message. If there is still no Heartbeat message received after (HeartBtInt + "some reasonable transmission time") seconds then the connection should be considered lost and corrective action be initiated.
In the scenario you gave, the counter party sending the constant traffic should notice that no heartbeat has been sent and send a test message. Assuming the test message is properly received and responded to (which it will, see Session.Next and Session.NextTestRequest), all should be good. It won't kill the connection until after the test message is sent + heartbt int + "some reasonable transmission time"
thanks and good point. I'll bring this up with party we're connecting to.
I have also included the distribution for any commentary on this:
I did some further reading, specifically from the sections you quoted. Right before the section you included was this line:
"When either end of a FIX connection has not sent any data for [HeartBtInt] seconds, it will transmit a Heartbeat message. "
source: http://fixprotocol.org/documents/346/fix-44_w_Errata_20030618_PDF.zip Volume2, Page 16, Section “Administrative Messages”:
So with the current code, my scenario where I receive constant traffic (perhaps a HeartBtInt+ second long stream of IOI messages) causes me skip a heartbeat. You are right that other end should send a test message, but in the same sense, I should be sending a heartbeat.
any thoughts?
Thanks,
-Matt
On Tue, Apr 24, 2012 at 10:42 AM, Chris Busbey reply@reply.github.com wrote:
I don't think this is necessarily a bug. The fix protocol handles both sides of the heartbeat transaction. Take a look at this:
http://fixwiki.fixprotocol.org/fixwiki/Heartbeat
Specifically the part about when a side has not received a heartbeat in the heartbeat interval.
When either end of the connection has not received any data for (HeartBtInt + "some reasonable transmission time") seconds, it will transmit a Test Request message. If there is still no Heartbeat message received after (HeartBtInt + "some reasonable transmission time") seconds then the connection should be considered lost and corrective action be initiated.
In the scenario you gave, the counter party sending the constant traffic should notice that no heartbeat has been sent and send a test message. Assuming the test message is properly received and responded to (which it will, see Session.Next and Session.NextTestRequest), all should be good. It won't kill the connection until after the test message is sent + heartbt int + "some reasonable transmission time"
Reply to this email directly or view it on GitHub: https://github.com/connamara/quickfixn/issues/57#issuecomment-5305992
Matt, IMHO, I don't think the receiving app in this case should send a heartbeat. Obviously, it is necessary for the sending app to send the TestRequest and the receiving app to respond with a heartbeat. I have seen the situation where the receiving app is a slow consumer and fails to process the TestRequest and send the heartbeat in time to prevent the sending app from disconnecting.
Jim
On Fri, Apr 27, 2012 at 8:14 AM, Matt Wood mjwood7@gmail.com wrote:
I have also included the distribution for any commentary on this:
I did some further reading, specifically from the sections you quoted. Right before the section you included was this line:
"When either end of a FIX connection has not sent any data for [HeartBtInt] seconds, it will transmit a Heartbeat message. "
source: http://fixprotocol.org/documents/346/fix-44_w_Errata_20030618_PDF.zip Volume2, Page 16, Section “Administrative Messages”:
So with the current code, my scenario where I receive constant traffic (perhaps a HeartBtInt+ second long stream of IOI messages) causes me skip a heartbeat. You are right that other end should send a test message, but in the same sense, I should be sending a heartbeat.
any thoughts?
Thanks,
-Matt
On Tue, Apr 24, 2012 at 10:42 AM, Chris Busbey < reply@reply.github.com
wrote: I don't think this is necessarily a bug. The fix protocol handles both sides of the heartbeat transaction. Take a look at this:
http://fixwiki.fixprotocol.org/fixwiki/Heartbeat
Specifically the part about when a side has not received a heartbeat in the heartbeat interval.
When either end of the connection has not received any data for (HeartBtInt + "some reasonable transmission time") seconds, it will transmit a Test Request message. If there is still no Heartbeat message received after (HeartBtInt + "some reasonable transmission time") seconds then the connection should be considered lost and corrective action be initiated.
In the scenario you gave, the counter party sending the constant traffic should notice that no heartbeat has been sent and send a test message. Assuming the test message is properly received and responded to (which it will, see Session.Next and Session.NextTestRequest), all should be good. It won't kill the connection until after the test message is sent + heartbt int + "some reasonable transmission time"
Reply to this email directly or view it on GitHub: https://github.com/connamara/quickfixn/issues/57#issuecomment-5305992
Quickfixn mailing list Quickfixn@lists.quickfixn.com http://lists.quickfixn.com/listinfo.cgi/quickfixn-quickfixn.com
Connamara Systems, LLC Made-To-Measure Trading Solutions. Exactly what you need. No more. No less. http://www.connamara.com http://connamara.com/
That last comment was from Connamara CEO @jcdowns. I think his identity theft on @mjwood7 was accidental...
Looks there can be some improvements to the Session class when it finishes processing a message. Specifically, should make heart beat check after processing message. I'll get a pull request in asap.
Hey Matt, I've got what I think is a fix in my fork on branch issue_57_fix. It's actually a one-liner, https://github.com/cbusbey/quickfixn/compare/issue_57_fix. I need to whip up some unit tests before submitting a pull request, but if you have the resources to test the fix, by all means give it a shot.
Chris, I probably won't be able to test it until Monday, but I'll let you know how it goes.
-Matt
On Fri, Apr 27, 2012 at 3:28 PM, Chris Busbey reply@reply.github.com wrote:
Hey Matt, I've got what I think is a fix in my fork on branch issue_57_fix. It's actually a one-liner, https://github.com/cbusbey/quickfixn/compare/issue_57_fix. I need to whip up some unit tests before submitting a pull request, but if you have the resources to test the fix, by all means give it a shot.
Reply to this email directly or view it on GitHub: https://github.com/connamara/quickfixn/issues/57#issuecomment-5388619
Hi Matt, let us know when you've had a chance. Will merge as soon as you confirm.
I should know by tomorrow afternoon. I'll keep you posted.
So far this looks good. I'm confident this is fixed however I'm going to continue monitoring till after the end of the production day, when we receiving a sharp increase in volume (even more so than at the start of the day).
Wunderbar. Thanks again for following up with the issue, Matt.
yeah seems to be fixed.
I believe I've found a bug:
As an initiator, if you receive constant traffic for an amount of time greater than your heartbeat interval, you'll skip the heartbeat and therefore will/should be disconnected by the remote host.
I think the logic responsible is found in SocketInitiatorThread.Read, within the nested if/then statements.