How to convert to string?

suhasghorp commented 1 year ago

Hello, thanks for the awesome library. One quick question: once I have the byte buffer after calling encode(), how do I convert it to a string so that the "\x01" characters are converted to "^A" characters? I tried various ways to do this, like repr(byte_buffer) or (ord(x) in byte_buffer), but could not. My bytes array looks like this:

b'8=FIX.4.4\x019=65\x0135=0\x0152=20220809-21:37:06.893\x0149=TRADEWEB\x0156=ABFIXREPO\x01347=UTF-8\x0110=045\x01'

Any idea? Thanks

da4089 commented 1 year ago

I think that bytes array is actually correct.

b'\x01' is Python's way of describing a byte with an integer value of 1. '^A' is an alternative representation (not understood by Python) of the same thing: it represents the keystroke Control-A, which used to (and might even still do in some cases) generate a character with an integer value 1 at a terminal. This is also sometimes referred to as SOH -- again, a byte with an integer value of 1, but in this case that's the ASCII abbreviation for Start Of Header, which was what that byte meant in standard ASCII.

I don't think this is what you need, but if for some reason you actually need a string with the '^' and 'A' as two separate characters representing the field separator, you could use the replace() method (defined for both str and bytes), like b'8=FIX.4.4\x019=65\x0135=0\x01'.replace(b'\x01', b'^A') and it will replace that for you. But I think it's unlikely this is what you want.
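As a minimal sketch of the replace() approach described above, using a shortened version of the message from the question (the variable names are just for illustration):

```python
# Sample FIX message bytes, shortened from the one in the question.
raw = b'8=FIX.4.4\x019=65\x0135=0\x01'

# bytes.replace() takes bytes arguments and returns a new bytes object;
# the original buffer is left unchanged.
caret = raw.replace(b'\x01', b'^A')
print(caret)  # b'8=FIX.4.4^A9=65^A35=0^A'

# Decode afterwards if a str is needed, e.g. for writing to a text file.
text = caret.decode('ascii')
print(text)  # 8=FIX.4.4^A9=65^A35=0^A
```

Note that this replaces every byte with value 1, so it assumes the message carries no raw binary data fields that happen to contain that byte.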

If you just want to convert the bytes type (suitable for sending to a socket, for instance) to a string (suitable for writing to a log file or something), you can use the decode() method, available on bytes instances. It decodes as UTF-8 by default, and the resulting string still contains the separator as a character with an integer value of 1, which Python's repr() shows as "\x01".
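A short sketch of what decode() does with the separator, again on a shortened sample message. It suggests the separator survives decoding and only its display differs:

```python
raw = b'8=FIX.4.4\x019=65\x0135=0\x01'

# decode() uses UTF-8 when no encoding is given and returns a str.
text = raw.decode()
print(type(text).__name__)  # str

# repr() shows the separator with Python's escape syntax.
print(repr(text))  # '8=FIX.4.4\x019=65\x0135=0\x01'

# The separator characters are all still there after decoding.
print(text.count('\x01'))  # 3
print(len(text) == len(raw))  # True
```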

Hope this helps!

suhasghorp commented 1 year ago

Thanks! You are correct, I just need to store the FIX message as a string in another file for an application to process, and that application expects "^A" as the delimiter. If I use decode(), it seems that I lose the "\x01" chars altogether. This is the string I get after decode:

8=FIX.4.49=6535=052=20220809-23:09:56.19649=TRADEWEB56=ABFIXREPO347=UTF-810=047

Is this expected? Or are the "\x01" chars present in the string but unprintable? Thanks again.

da4089 commented 1 year ago

Yes, it really depends on where the result is being displayed: the SOH character doesn't have a standard representation, which is why some things show "^A", and Python uses "\x01", and others just show nothing.
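To illustrate that the character is present but unprintable, here is a small sketch. The caret_notation() helper is hypothetical (it is not part of Python or simplefix); it relies on the usual caret-notation convention that a control character is written as '^' followed by the character whose code is the control code XOR 0x40, so 0x01 becomes "^A".

```python
def caret_notation(text: str) -> str:
    """Render ASCII control characters (0x00-0x1F and 0x7F) in caret notation."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 or code == 0x7F:
            # 0x01 ^ 0x40 == 0x41, which is 'A', giving "^A".
            out.append('^' + chr(code ^ 0x40))
        else:
            out.append(ch)
    return ''.join(out)


decoded = b'8=FIX.4.4\x019=65\x0135=0\x01'.decode()

print('\x01' in decoded)        # True: the separator is in the string
print(decoded.isprintable())    # False: but it is not a printable character
print(caret_notation(decoded))  # 8=FIX.4.4^A9=65^A35=0^A
```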

It's pretty common in FIX logging, for instance, to replace them with the vertical bar character for exactly this reason.

In fact, built into simplefix is a function that might be useful: simplefix.pretty_print(). Pass it the bytes object, and it'll do the replacement for you. It defaults to a | char, but you can pass an alternative as the second parameter if you prefer something else.

Help on function pretty_print in module simplefix:

pretty_print(buf, sep='|')
    Pretty-print a raw FIX buffer.

    :param buf: Byte sequence containing raw message.
    :param sep: Separator character to use for output, default is '|'.
    :returns: Formatted byte array.

suhasghorp commented 1 year ago

Thank you, this function resolves my issue.

byte_buffer = message.encode()
str_buffer = simplefix.pretty_print(byte_buffer, '|').decode('utf-8').replace('|','^A')
print(str_buffer)

This prints - 8=FIX.4.4^A9=65^A35=0^A52=20220810-12:39:49.280^A49=TRADEWEB^A56=ABFIXREPO^A347=UTF-8^A10=036^A

da4089 commented 1 year ago

Great that you've got it working. I think you could make one further simplification:

str_buffer = simplefix.pretty_print(byte_buffer, '^A').decode('utf-8')
print(str_buffer)

Doesn't save much, but ... you can have pretty_print do the replace for you.

suhasghorp commented 1 year ago

I tried that first but got the following error. It looks like the "sep" parameter is expecting a char and "^A" is considered two chars.

Traceback (most recent call last):
  File "test_fix.py", line 14, in <module>
    str_buffer = simplefix.pretty_print(byte_buffer, '^A').decode('utf-8')
  File "/pylocal/simplefix/__init__.py", line 43, in pretty_print
    cooked[i] = ord(sep) if value == 1 else value
TypeError: ord() expected a character, but string of length 2 found

Thanks.
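The error can be reproduced without simplefix, since it comes from Python's built-in ord(), which accepts only a single character. A minimal sketch:

```python
# ord() works on a one-character string and returns its code point.
print(ord('|'))  # 124

# A two-character string such as '^A' is rejected with a TypeError.
error = None
try:
    ord('^A')
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```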

da4089 commented 1 year ago

Ah ... that feels like a bug. I'm going to reopen this issue, and investigate further, but ... it seems like it'd be much more useful if it accepted a string (or bytes) there?
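One way a separator of any length could be supported is sketched below. This is only an illustration: pretty_print_multi() is a hypothetical name, and this is not simplefix's actual implementation or fix. It assumes the goal is simply to replace every byte with value 1 by the given separator.

```python
def pretty_print_multi(buf, sep='|') -> bytes:
    """Replace SOH (0x01) bytes in a raw FIX buffer with sep.

    Hypothetical sketch: sep may be a str or bytes of any length.
    """
    if isinstance(sep, str):
        sep = sep.encode('utf-8')
    # bytes() accepts bytes, bytearray, or other bytes-like input.
    return bytes(buf).replace(b'\x01', sep)


raw = b'8=FIX.4.4\x0135=0\x01'
print(pretty_print_multi(raw))         # b'8=FIX.4.4|35=0|'
print(pretty_print_multi(raw, '^A'))   # b'8=FIX.4.4^A35=0^A'
print(pretty_print_multi(raw, b'^A'))  # b'8=FIX.4.4^A35=0^A'
```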

I think this is in fact the very first bit of code that subsequently became simplefix. I haven't looked at it for ages.

da4089 / simplefix

How to convert to string? #40