connamara / quickfixn

QuickFIX/n implements the FIX protocol on .NET.
http://quickfixn.org

Field group got broken on message resend #875

Closed joaocpribeiro closed 2 months ago

joaocpribeiro commented 3 months ago

Version: 1.11.2

Steps:

Current behaviour:

Expected behaviour:

Example data:

FIX message that was attempted to be sent while there were network issues: 8=FIXT.1.1|9=311|35=D|34=2278|49=EXAMPLE_INIT|52=20240812-14:38:39.223|56=EXAMPLE_ACCEPT|11=9560967b495d48cb8d3611cc2c61612d|15=EUR|21=1|22=4|38=5|40=2|44=50|48=DE0000000000|54=1|55=DE0000000000|59=6|60=20240812-14:38:38.913|100=MUNC|432=20250807|1133=G|10050=0000000|453=1|448=EXAMPLE_PARTY|447=D|452=24|802=1|523=K0000000000|803=22|10=146|

FIX message that was sent after SequenceReset: 8=FIXT.1.1|9=342|35=D|34=2278|43=Y|49=EXAMPLE_INIT|52=20240812-14:57:35.406|56=EXAMPLE_ACCEPT|122=20240812-14:38:39.223|11=9560967b495d48cb8d3611cc2c61612d|15=EUR|21=1|22=4|38=5|40=2|44=50|48=DE0000000000|54=1|55=DE0000000000|59=6|60=20240812-14:38:38.913|100=MUNC|432=20250807|447=D|448=EXAMPLE_PARTY|452=24|453=1|523=K0000000000|802=1|803=22|1133=G|10050=0000000|10=156

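Please ignore the wrong CheckSum; I had to change some data in the messages. Please focus on tag 453 (the NoPartyIDs group). In the resent message the group is broken: its fields are reordered by tag number, so 453 no longer precedes its member fields 448, 447, 452 and 802.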

Note that the messages do not have the field groups broken when the first sending attempt is successful.

gbirchmeier commented 2 months ago

When you first send any given message, the engine doesn't do any processing or verification on it. You created the message object, and the engine only calls ToString()* to get the string to send, and then it sends it. However, the message goes into the store as just a string.

*Note: a recent PR changes Message.ToString() to Message.ConstructString()

On the resend, the engine pulls that message string out of the store, and then re-constructs a Message object from it. NOW it uses a data dictionary (DD) for construction. If the DD is configured correctly, no problem. If the DD is not, then the reconstruction is going to go badly, and that's what you're seeing.
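To illustrate the failure mode, here is a minimal Python sketch. It is not QuickFIX/n code. It assumes a simplified model in which a parser with no group definitions treats every field as a flat top-level field and writes the fields back in ascending tag order, which is consistent with the resent message in the example data. The helper names (`parse_fields`, `rebuild_without_dd`) are hypothetical.

```python
SOH = "|"  # the thread shows the FIX field separator as "|"


def parse_fields(raw):
    """Split a FIX fragment into (tag, value) pairs, keeping wire order."""
    return [tuple(part.split("=", 1)) for part in raw.strip(SOH).split(SOH)]


def rebuild_without_dd(raw):
    """Assumed model: with no group knowledge, fields are flat and
    re-emitted sorted by numeric tag, so group structure is lost."""
    fields = sorted(parse_fields(raw), key=lambda field: int(field[0]))
    return SOH.join(f"{tag}={value}" for tag, value in fields)


# Tail of the body from the first message in the example data.
original_tail = (
    "432=20250807|1133=G|10050=0000000|453=1|448=EXAMPLE_PARTY|447=D|"
    "452=24|802=1|523=K0000000000|803=22"
)

rebuilt = rebuild_without_dd(original_tail)
print(rebuilt)
```

In the printed result, 453=1 ends up between 452=24 and 523=K0000000000, the same layout as the resent message reported above.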

So I believe this is occurring at Session.NextResendRequest(), where it calls

msg.FromString(msgStr, true, SessionDataDictionary, ApplicationDataDictionary, _msgFactory, ignoreBody: false);

Somehow for you, I think ApplicationDataDictionary is null. (Maybe SessionDataDictionary is too?)

So... what does your configuration look like?

joaocpribeiro commented 2 months ago

@gbirchmeier, my config looks like the following:

[DEFAULT]
ConnectionType=initiator
ReconnectInterval=5
FileStorePath={{FixFilesPath}}
FileLogPath=log
StartTime={{FixStartTime}}
EndTime={{FixEndTime}}
TimeZone={{FixTimeZone}}
UseDataDictionary=N
SocketConnectHost={{SocketConnectHost}}
SocketConnectPort={{SocketConnectPort}}
LogoutTimeout=5
ResetOnLogon=N
ResetOnDisconnect=N

[SESSION]
BeginString=FIXT.1.1
DefaultApplVerID=FIX.5.0
SenderCompID={{SenderCompID}}
TargetCompID={{TargetCompID}}
HeartBtInt=30

Some of the values are defined at runtime, but I believe they have no impact on this issue. You referred Session.NextSendRequest(). Is this something I can catch on my code? Unfortunately this is a scenario that I cannot test easily. It was reported once, after the Acceptor had connection issues. Since I had no answer in this thread for some time, I started blocking new NewOrderSingle messages while the session is down, but in the future the session might appear to be up and the Acceptor might run into issues at the very next step...

gbirchmeier commented 2 months ago

Ok, this is your problem:

UseDataDictionary=N

If you don't use a DD, then your app cannot correctly parse any message that has repeating groups in it, including those received from your counterparty. I don't know what state your project is in, but I'm surprised you haven't had problems well before this.
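For a FIXT.1.1 session, turning the dictionary on could look roughly like the fragment below. `UseDataDictionary`, `TransportDataDictionary` and `AppDataDictionary` are QuickFIX/n settings; the file paths are assumptions for illustration only. A counterparty-specific tag such as 10050 from the example data is not part of the standard dictionaries, so the application dictionary would probably need to be customised to the counterparty's specification.

```
[DEFAULT]
UseDataDictionary=Y
# Assumed paths: point these at the dictionaries agreed with the counterparty
TransportDataDictionary=spec/fix/FIXT11.xml
AppDataDictionary=spec/fix/FIX50.xml
```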

It's right there in the docs: [image]

As I mentioned above, resends involve re-parsing the string message from the store. No DD? Then you can't correctly parse it.
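As a rough illustration of what the dictionary contributes during re-parsing, the Python sketch below hard-codes two group definitions and uses them to keep group members together. It is a simplified model, not the QuickFIX/n parser. The `GROUPS` table follows the standard FIX Parties layout (NoPartyIDs 453, with nested NoPartySubIDs 802), all function names are hypothetical, and the declared instance counts are not validated.

```python
SOH = "|"  # the thread shows the FIX field separator as "|"

# Group count tag -> ordered member tags; the first member is the delimiter.
# Hand-written stand-in for what a data dictionary would supply.
GROUPS = {
    "453": ["448", "447", "452", "802"],  # NoPartyIDs
    "802": ["523", "803"],                # NoPartySubIDs (nested)
}


def parse_fields(raw):
    """Split a FIX fragment into (tag, value) pairs, keeping wire order."""
    return [tuple(part.split("=", 1)) for part in raw.strip(SOH).split(SOH)]


def read_group(fields, pos, count_tag):
    """Read the instances of one group starting at fields[pos]."""
    members = GROUPS[count_tag]
    delimiter = members[0]
    instances = []
    while pos < len(fields) and fields[pos][0] == delimiter:
        instance, first = [], True
        while pos < len(fields):
            tag, value = fields[pos]
            # Stop at a non-member tag, or at the delimiter of the next instance.
            if tag not in members or (tag == delimiter and not first):
                break
            first = False
            pos += 1
            nested = None
            if tag in GROUPS:
                nested, pos = read_group(fields, pos, tag)
            instance.append((tag, value, nested))
        instances.append(instance)
    return instances, pos


def serialize_instances(instances):
    """Write group members back in the order they were read."""
    out = []
    for instance in instances:
        for tag, value, nested in instance:
            out.append(f"{tag}={value}")
            if nested is not None:
                out.extend(serialize_instances(nested))
    return out


def rebuild_with_groups(raw):
    """Sort plain fields by tag, then emit each group with its order intact."""
    fields, pos, plain, groups = parse_fields(raw), 0, [], []
    while pos < len(fields):
        tag, value = fields[pos]
        pos += 1
        if tag in GROUPS:
            instances, pos = read_group(fields, pos, tag)
            groups.append((tag, value, instances))
        else:
            plain.append((tag, value))
    out = [f"{t}={v}" for t, v in sorted(plain, key=lambda f: int(f[0]))]
    for tag, value, instances in groups:
        out.append(f"{tag}={value}")
        out.extend(serialize_instances(instances))
    return SOH.join(out)


original_tail = (
    "432=20250807|1133=G|10050=0000000|453=1|448=EXAMPLE_PARTY|447=D|"
    "452=24|802=1|523=K0000000000|803=22"
)
print(rebuild_with_groups(original_tail) == original_tail)
```

With the group layout known, the rebuilt fragment keeps 453 directly in front of its members, matching the message as it was first sent.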

Frankly, I don't know why we even allow UseDataDictionary=N. In my 15 years working with the various QF versions, I have never used it. There is no counterparty I have ever connected to that would work without a DD.

So:

I'm going to close this issue, as it's clear to me now that there's not a QF bug here.

gbirchmeier commented 2 months ago

Oh, your other question:

You referred Session.NextSendRequest(). Is this something I can catch on my code?

No, this is internal engine code. I was just giving notes on my investigation. (And it's actually NextResendRequest(), I corrected that above)

joaocpribeiro commented 1 month ago

@gbirchmeier Thank you! We are still not in production, that's why we haven't had problems before. I started talks with my counterparty to check if they have something I can use to create my data dictionary. I will write again once we have it in place.