drewnoakes / fix-decoder

Unravels FIX messages into human readable tables
https://drewnoakes.com/fix-decoder/
80 stars 34 forks

Auto-detect separator #42

Open thawk opened 1 year ago

thawk commented 1 year ago

Because the first three tags in a FIX message are 8/9/35, and we can assume the same separator is used throughout the message, we can detect the separator from these three tags. Since the value of 9= is always numeric, we can take the string that runs from the first non-digit character after 9= up to 35= as the separator.
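A minimal sketch of that idea in TypeScript follows. It is not the code from the PR: the function name `detectSeparator` and the guard that rejects a candidate containing `=` (which would mean tag 35 did not directly follow tag 9) are assumptions made for illustration.

```typescript
// Hypothetical helper: derive the separator from the text that sits between
// the digits of BodyLength (tag 9) and the start of MsgType (tag 35).
function detectSeparator(message: string): string | null {
  // 8=<value><sep>9=<digits><sep>35=
  // Group 1 starts at the first non-digit after "9=<digits>" and lazily
  // extends until "35=" is reached.
  const match = /^8=.*?9=\d+(\D.*?)35=/.exec(message);
  const candidate = match ? match[1] : undefined;

  // No standard header found, or another tag sits between 9 and 35:
  // report "unknown" so the caller can fall back to something else.
  if (candidate === undefined || candidate.indexOf("=") !== -1) {
    return null;
  }
  return candidate;
}

// Usage (inputs shortened from the log samples in this thread):
const fromAngle = detectSeparator("8=FIX.4.2<SOH>9=130<SOH>35=AE<SOH>"); // "<SOH>"
const fromSemicolon = detectSeparator("8=FIX.4.2;9=130;35=AE;");         // ";"
const nonStandard = detectSeparator("8=FIX.4.2;9=130;49=X;35=AE;");      // null
```

A separator that itself begins with a digit cannot be found this way, because the leading digits would be read as part of the BodyLength value.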

drewnoakes commented 1 year ago

Thanks for the PR! Can you share some data to test this change with, that shows a case where it improves parsing?

thawk commented 1 year ago

Because different software produces different FIX log formats, <SOH> gets replaced by various printable strings so that it can be seen; even nul (0x01) is used in place of SOH, for some reason I don't know :-(

The following are several log formats we have encountered:

8=FIX.4.2<SOH>9=130<SOH>35=AE<SOH>49=LSEHub<SOH>56=LSETR<SOH>115=BROKERX<SOH>34=2287<SOH>43=N<SOH>52=20120330-12:14:09<SOH>370=20120330-12:14:09.816<SOH>571=00008661533TRLO1-1-1-0<SOH>150=H<SOH>10=074<SOH>
8=FIX.4.2[SOH]9=130[SOH]35=AE[SOH]49=LSEHub[SOH]56=LSETR[SOH]115=BROKERX[SOH]34=2287[SOH]43=N[SOH]52=20120330-12:14:09[SOH]370=20120330-12:14:09.816[SOH]571=00008661533TRLO1-1-1-0[SOH]150=H[SOH]10=074[SOH]
8=FIX.4.2;9=130;35=AE;49=LSEHub;56=LSETR;115=BROKERX;34=2287;43=N;52=20120330-12:14:09;370=20120330-12:14:09.816;571=00008661533TRLO1-1-1-0;150=H;10=074;

drewnoakes commented 1 year ago

> Because the first 3 tags in FIX message is 8/9/35

Are we sure about that? I don't work much with FIX at the moment. My recollection of FIX is that there are very few standard behaviours in the wild. I'm all for auto-detecting the separator, but I am concerned about doing so based on an assumption that might not be universally true.

whatthefrog commented 1 year ago

Hi both,

Indeed, from what I saw in my FIX years, tags 8/9/35 are the usual start of a message, but I also remember that FIX can have very "exotic" variations, to say the least ;-)

I guess adding some sort of auto-detection is indeed nice. It would be good if the detection is not "stubborn": it should behave like an extra "magic" feature when it detects a format, and just quietly do nothing (or maybe show a little warning in the UI) when the input does not match a list of pre-configured formats (that users may be able to customize?)
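One way to picture such a non-stubborn detector is the sketch below. Everything in it is hypothetical (the `DetectionResult` shape, the function name, the default list, and the warning text); the thread does not define an API for this. The default list uses the separators mentioned in this thread, with `\x01` assumed to stand for the raw SOH byte.

```typescript
// Hypothetical result shape: a null separator means "do nothing, keep the
// current behaviour"; the optional warning is something a UI could display.
interface DetectionResult {
  separator: string | null;
  warning?: string;
}

// Assumed default list; callers can pass their own to customize it.
const defaultSeparators: string[] = ["\x01", "|", ";", "<SOH>", "[SOH]", "^A"];

function detectFromKnownFormats(
  message: string,
  candidates: string[] = defaultSeparators
): DetectionResult {
  for (const sep of candidates) {
    // Accept a candidate only if it yields the standard 8 / 9 / 35 header.
    const [first = "", second = "", third = ""] = message.split(sep);
    if (/^8=/.test(first) && /^9=\d+$/.test(second) && /^35=/.test(third)) {
      return { separator: sep };
    }
  }
  // Nothing matched: stay quiet apart from an optional hint.
  return { separator: null, warning: "No known FIX separator detected" };
}

// Usage: a built-in format, a user-supplied one, and unrecognised input.
const known = detectFromKnownFormats("8=FIX.4.2[SOH]9=130[SOH]35=AE[SOH]");
const custom = detectFromKnownFormats("8=FIX.4.4#9=5#35=0#", ["#"]);
const unknown = detectFromKnownFormats("not a FIX message");
```

Because the function only reports what it found, the caller decides whether to apply the separator, ignore it, or surface the warning.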

Sorry if I am not clear ...

thawk commented 1 year ago

> > Because the first 3 tags in FIX message is 8/9/35
>
> Are we sure about that? I don't work much with FIX at the moment. My recollection of FIX is that there are very few standard behaviours in the wild. I'm all for auto-detecting the separator, but I am concerned about doing so based on an assumption that might not be universally true.

On page 20 of the FINANCIAL INFORMATION EXCHANGE PROTOCOL (FIX) Version 5.0 Service Pack 2, Volume 1, in the section FIX "Tag=Value" SYNTAX, under Message Format, rule 2 says:

> The first three fields in the standard header are BeginString (tag #8) followed by BodyLength (tag #9) followed by MsgType (tag #35).

So if a message uses the standard header, this should work. If not, the algorithm falls back to one of the separators in /\||;|\x001|\[SOH\]|<SOH>|\^A/, which extends the list of supported separators with three multi-character separators, [SOH], <SOH> and ^A, that I have encountered in my work.
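The fallback step might look roughly like the following sketch. It is an illustration rather than the PR's code: `splitFields` and its `detected` parameter are hypothetical names, and the pattern uses `\x01` (the SOH byte) on the assumption that this is what `\x001` in the pattern above is meant to match.

```typescript
// Fallback pattern modelled on the one quoted in this comment.
// Assumption: "\x01" (SOH) replaces the "\x001" written in the thread.
const fallbackSeparator = /\||;|\x01|\[SOH\]|<SOH>|\^A/;

// Split a message into [tag, value] pairs. `detected` is the separator found
// from the 8/9/35 header, or null when the header was not recognised.
function splitFields(message: string, detected: string | null): string[][] {
  const separator: string | RegExp =
    detected !== null ? detected : fallbackSeparator;

  const pairs: string[][] = [];
  for (const field of message.split(separator)) {
    const eq = field.indexOf("=");
    if (eq <= 0) {
      continue; // skips the empty piece after a trailing separator
    }
    pairs.push([field.slice(0, eq), field.slice(eq + 1)]);
  }
  return pairs;
}

// Usage: header detection failed (null), so the fallback pattern is used.
const viaFallback = splitFields("8=FIX.4.2^A9=130^A35=AE^A", null);
// Usage: a detected separator that is not in the fallback list.
const viaDetected = splitFields("8=FIX.4.2::9=5::35=0::", "::");
```

Using the detected string in preference to the pattern also avoids splitting on a `;` or `|` that happens to appear inside a field value when the message actually uses a different separator.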