epics-base / pvAccessCPP

pvAccessCPP is an EPICS V4 C++ module
https://epics-base.github.io/pvAccessCPP/
Other
10 stars 22 forks source link

CMD_SET_ENDIANESS in protocol spec #183

Open mdavidsaver opened 2 years ago

mdavidsaver commented 2 years ago

The description of the CMD_SET_ENDIANESS control message currently says:

The 7-th bit of a header flags field indicates the server's selected byte order for the connection on which this message was received. Client MUST encode all the messages sent via this connection using this byte order.
The client's decoding byte order depends on the payload size field value as follows:

Payload Size Field Value | Meaning
-- | --
0x00000000 | Client MUST decode all the messages received via this connection using server's selected byte order.
0xFFFFFFFF | Client MUST decode all the messages sent received this connection as indicated by each message byte order flag.

Which it seems I've been confused by. I know that the pvAccessCPP/Java codes only implement the first part of this by inspecting flags&0x80. The size field is never tested.

https://github.com/epics-base/pvAccessCPP/blob/master/src/remote/pv/codec.h#L310-L315

https://github.com/epics-rip/pvAccessJava/blob/4e62c8646bf8bea2e0e60fe6d194636dd0e4d648/src/org/epics/pvaccess/impl/remote/tcp/NonBlockingTCPTransport.java#L259-L263

In addition, both servers always send size=0.

https://github.com/epics-rip/pvAccessJava/blob/4e62c8646bf8bea2e0e60fe6d194636dd0e4d648/src/org/epics/pvaccess/server/impl/remote/tcp/NonBlockingServerTCPTransport.java#L271

I don't think the 0xFFFFFFFF behavior has ever worked.

So what to do? Should I remove mentions of 0xFFFFFFFF?

@kasemir Thoughts? Is your new server doing anything with the size field in this case? How are you handling message endianness on TCP?

fyi. The root cause of mdavidsaver/p4p#83 seems to be that I apparently got the meanings of 0x00000000 vs. 0xFFFFFFFF backwards.

kasemir commented 2 years ago

In the new Java implementation, I check the byte order flag in each message and act accordingly, https://github.com/ControlSystemStudio/phoebus/blob/5c840360eaf8c395bf83cb9520b422426541de7f/core/pva/src/main/java/org/epics/pva/common/PVAHeader.java#L187

So that's basically the 0xFFFFFFFF behavior, "Client MUST decode all the messages sent received this connection as indicated by each message byte order flag." So each received message is checked until we reach the byte order bit, and from then on we read it in the requested order. Messages sent to the server use the byte order that the server had in its CMD_SET_ENDIANESS message, https://github.com/ControlSystemStudio/phoebus/blob/5c840360eaf8c395bf83cb9520b422426541de7f/core/pva/src/main/java/org/epics/pva/client/ClientTCPHandler.java#L309

I never check the 'size' part. If it says 0x00000000 "Client MUST decode all the messages received via this connection using server's selected byte order.", I don't do that, I use the byte order given in each message, not the one sent in the initial SET_ENDIANESS.

I suggest to update the CMD_SET_ENDIANESS/set-byte-order-0x02 info: The server uses it to specify in which byte order the client should reply. It is highly likely that the server will also send all messages in that byte order, but each message contains its own byte order indicator, so both server and client should be prepared to decode each message as its received with the byte order indicated in each message.

mdavidsaver commented 2 years ago

So that's basically the 0xFFFFFFFF behavior ...

Well, at least I'm in good company.

The server uses it to specify in which byte order the client should reply.

I don't think can be any "should" here. I've looked, and the pvAccessCPP peer really doesn't check the endian flag in any header other than the CMD_SET_ENDIANESS. So this is a "must", or the server won't decode properly. Which is what happens with PVXS and the pvAccessJava server now.

I'm going to see if I can setup pvAccessCPP on some big endian target to verify that it doesn't work either. I don't think it will though.

I certainly think the spec. must mandate consistency between the header flag and the body encoding.