FIXTradingCommunity / fixp-specification

FIXP - FIX performance session layer specification

Out of Band Recovery #24

Open adkapur opened 7 years ago

adkapur commented 7 years ago

To facilitate out-of-band recovery for previously used UUIDs, we might need to enhance RetransmitRequest with an additional field such as OnBehalfOfSessionID. We cannot expect the same UUID to be used to Negotiate another FIXP session in order to recover messages previously sent under that UUID, because the sequence streams would overlap.
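The overlap concern can be illustrated with a small sketch. It is not taken from the FIXP specification: `SentMessageLog` and its methods are hypothetical names, and it assumes that a newly negotiated session starts its sequence numbers at 1 and that sent messages are stored by (session UUID, sequence number).

```python
from typing import Dict, List, Tuple
from uuid import UUID, uuid4


class SentMessageLog:
    """Toy store of sent messages, keyed by (session UUID, sequence number)."""

    def __init__(self) -> None:
        self._messages: Dict[Tuple[UUID, int], str] = {}

    def record(self, session_id: UUID, seq_no: int, payload: str) -> None:
        key = (session_id, seq_no)
        if key in self._messages:
            # Two different messages would share one (UUID, seq_no) identity.
            raise ValueError(f"sequence {seq_no} already used for session {session_id}")
        self._messages[key] = payload

    def fetch(self, session_id: UUID, from_seq_no: int, count: int) -> List[str]:
        return [self._messages[(session_id, n)]
                for n in range(from_seq_no, from_seq_no + count)]


log = SentMessageLog()
old_session = uuid4()
for seq_no, payload in enumerate(["fill A", "fill B", "fill C"], start=1):
    log.record(old_session, seq_no, payload)

# Assumption: a session negotiated again with the old UUID restarts at 1,
# so its first message collides with the old stream.
try:
    log.record(old_session, 1, "first message of the new session")
    reused_uuid_collides = False
except ValueError:
    reused_uuid_collides = True

# A fresh UUID keeps the two streams apart, but the old stream then has to be
# named explicitly, which is what an extra field on the request would carry.
recovery_session = uuid4()
log.record(recovery_session, 1, "first message of the recovery session")
recovered = log.fetch(old_session, from_seq_no=2, count=2)
```

With a fresh UUID, the recovery session's own sequence numbers never clash with the old stream. The cost is that the request has to state which session it is asking about.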

kleihan commented 7 years ago

Thread copied from iMeetCentral to be continued here:

Li Zhu: There are many ways to do out-of-band recovery, perhaps not even on top of FIXP.

Even if you use FIXP, you can wrap the OOB recovery as a special application service and embed both the request and the recovered messages in application messages.

Aditya Kapur: But if application sequencing is not being used and only FIXP sequencing is being used, then it is currently not possible to support out-of-band recovery for a previously used UUID.

Hanno Klein: Isn't that your choice, i.e. whether you use application sequencing or FIXP sequencing? The choice should depend on your requirements and use cases. What forces you to use FIXP sequencing? Cross-session recovery may require application sequencing.

Aditya Kapur: Forcing customers to use application sequencing to recover messages from a prior FIXP session places an extra burden on them, and that is not acceptable. If we use only FIXP sequencing all the way, then it must work for this use case as well.

kleihan commented 7 years ago

Would you agree that application level recovery is something different from session level recovery? If you agree then you are not trying to duplicate a feature available on the application level to the session level. This is how it sounds when you say that using application sequencing is an extra burden for the user and that he should be able to do it (=application sequencing) with FIXP.

I looked into FIXP chapter 3.5.3 Session Lifetime. I understand "After finalization, the old session ID is no longer valid, and messages are no longer recoverable." to mean that FIXP messages from one session cannot be recovered over a different session. But you can re-establish a lost session: "However, a client may reconnect and bind the existing session to the new transport. When re-establishing an existing session, the original session ID continues to be used, and recoverable messages that were lost by disconnection may be recovered."

Maybe I am wrong, but it sounds like a conceptual problem, i.e. should there be something like a "RelatedSessionID" to be able to make a reference across sessions for any purpose? From the viewpoint of FIXP, messages coming over a session have no application context. The mechanisms to get them to the receiver include retransmission, and it does not require a second session. If you see two FIXP sessions as channels for the application where one serves as a recovery channel for the other, then you are adding this application context, and that is where I think such a feature crosses the line. Session level recovery is conceptually bound to the recovery of messages within a single session. Application level recovery is agnostic of the number of sessions used.

For example, in the case of market data, one sends the data twice over two UDP sessions and hopes that both sessions will never fail at the same time. Allowing one session to request retransmission of missed messages from the other session only makes sense from an application level view, where one would know that both sessions actually contain the same data. From a session level view, one would assume that the data is different and see no value in retransmitting messages from another session interspersed with the "normal" messages of that session.

adkapur commented 7 years ago

Hanno:

> Would you agree that application level recovery is something different from session level recovery? If you agree then you are not trying to duplicate a feature available on the application level to the session level. This is how it sounds when you say that using application sequencing is an extra burden for the user and that he should be able to do it (=application sequencing) with FIXP.

I am saying that if the exchange uses only FIXP and not application sequencing, then FIXP should support an out-of-band recovery mechanism. I don't want the exchange to support recovery at both the FIXP and application layers; I want to do it only in FIXP.

> I looked into FIXP chapter 3.5.3 Session Lifetime. I understand "After finalization, the old session ID is no longer valid, and messages are no longer recoverable." to mean that FIXP messages from one session cannot be recovered over a different session. But you can re-establish a lost session: "However, a client may reconnect and bind the existing session to the new transport. When re-establishing an existing session, the original session ID continues to be used, and recoverable messages that were lost by disconnection may be recovered."

This is correct, but the idea was always to allow recovery of a previous FIXP session out of band through another gateway, for example in a disaster recovery scenario. An intraday FIXP session could still be re-established with the same UUID and proceed as usual.

> Maybe I am wrong, but it sounds like a conceptual problem, i.e. should there be something like a "RelatedSessionID" to be able to make a reference across sessions for any purpose? From the viewpoint of FIXP, messages coming over a session have no application context. The mechanisms to get them to the receiver include retransmission, and it does not require a second session. If you see two FIXP sessions as channels for the application where one serves as a recovery channel for the other, then you are adding this application context, and that is where I think such a feature crosses the line. Session level recovery is conceptually bound to the recovery of messages within a single session. Application level recovery is agnostic of the number of sessions used.

Yes, we had previously talked about adding an on-behalf-of UUID, or something similar such as RelatedSessionID, to the RetransmitRequest. Retransmission requires a second FIXP session because it is out of band, on another gateway outside the critical order entry path. I don't think there is any conflict here: session level recovery is still bound to one particular session only; the request is just being entered on behalf of another session, if you will.

> For example, in the case of market data, one sends the data twice over two UDP sessions and hopes that both sessions will never fail at the same time. Allowing one session to request retransmission of missed messages from the other session only makes sense from an application level view, where one would know that both sessions actually contain the same data. From a session level view, one would assume that the data is different and see no value in retransmitting messages from another session interspersed with the "normal" messages of that session.

Actually, the session used to request retransmission of another session's missed messages cannot itself be used to send business messages, since its sole purpose is recovery. It is an empty placeholder session that exists only to recover the other session's messages, and both sessions belong to the same business entity (think of it almost like an entering session performing recovery for an executing session).
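The two constraints described here (a recovery session carries no business messages, and it may only recover for a session of the same business entity) could be checked roughly as follows. This is a sketch under assumptions: the `Session` class, its `firm` and `recovery_only` attributes, and both function names are hypothetical and do not appear in FIXP.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Session:
    session_id: UUID
    firm: str                    # owning business entity (hypothetical attribute)
    recovery_only: bool = False  # True for the empty placeholder session


def may_send_business_message(session: Session) -> bool:
    """A recovery-only session exists solely to request retransmissions."""
    return not session.recovery_only


def may_recover_on_behalf_of(requester: Session, target: Session) -> bool:
    """Allow cross-session recovery only from a recovery-only session
    that belongs to the same business entity as the target session."""
    return requester.recovery_only and requester.firm == target.firm


order_entry = Session(uuid4(), firm="FIRM_A")
recovery = Session(uuid4(), firm="FIRM_A", recovery_only=True)
other_firm = Session(uuid4(), firm="FIRM_B", recovery_only=True)
```

Here `recovery` may recover for `order_entry` but may not send orders, while `other_firm` is refused because it belongs to a different entity.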

kleihan commented 7 years ago

My view is that you are trying to do things with FIXP that are beyond the scope of its intended recovery capability. This limitation is intentional, to keep it simple, at least for version 1.0; i.e. the discussion should be postponed to a future version of FIXP.

adkapur commented 7 years ago

Actually, it was supposed to be in RC3, but I was the one who took it out since there was no urgency at the time. Now we are actively looking at this, and I think all it requires is the addition of a new field called RelatedSessionID to the RetransmitRequest, Retransmission (response) and RetransmitReject messages.
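As a rough illustration of the proposal, the three messages might carry the new field like this. Only the name RelatedSessionID comes from this thread. The other fields are simplified, the `respond` function and its reject reasons are hypothetical, and nothing here reflects an agreed wire format.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union
from uuid import UUID, uuid4


@dataclass(frozen=True)
class RetransmitRequest:
    session_id: UUID                           # session the request is sent on
    timestamp: int
    from_seq_no: int
    count: int
    related_session_id: Optional[UUID] = None  # proposed: session to recover


@dataclass(frozen=True)
class Retransmission:
    session_id: UUID
    request_timestamp: int
    next_seq_no: int
    count: int
    related_session_id: Optional[UUID] = None  # echoed from the request


@dataclass(frozen=True)
class RetransmitReject:
    session_id: UUID
    request_timestamp: int
    reason: str
    related_session_id: Optional[UUID] = None  # echoed from the request


def respond(request: RetransmitRequest,
            last_seq_no_by_session: Dict[UUID, int]
            ) -> Union[Retransmission, RetransmitReject]:
    """Resolve which stream is being asked for, then accept or reject."""
    source = (request.related_session_id
              if request.related_session_id is not None
              else request.session_id)
    last_seq_no = last_seq_no_by_session.get(source)
    if last_seq_no is None:
        return RetransmitReject(request.session_id, request.timestamp,
                                "unknown session", request.related_session_id)
    last_requested = request.from_seq_no + request.count - 1
    if request.from_seq_no < 1 or request.count < 1 or last_requested > last_seq_no:
        return RetransmitReject(request.session_id, request.timestamp,
                                "out of range", request.related_session_id)
    return Retransmission(request.session_id, request.timestamp,
                          next_seq_no=request.from_seq_no, count=request.count,
                          related_session_id=request.related_session_id)


old_session, recovery_session = uuid4(), uuid4()
last_seq_nos = {old_session: 500, recovery_session: 0}
reply = respond(RetransmitRequest(recovery_session, 1, from_seq_no=101, count=50,
                                  related_session_id=old_session), last_seq_nos)
```

Leaving the field unset keeps the existing in-band behaviour, where the request refers to the session it was sent on.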

donmendelson commented 6 years ago

Still an open topic after RC4.