Closed MikeHathaway closed 3 years ago
Could you provide an explanation of the change? The code this PR removes seems to remove filled orders from the local order book, which would presumably help prevent attempts to cancel them.
Sure, so one issue was that the method used to assign incoming messages to a given orders queue was incomplete. Rejected Order Cancel Requests are returned with 35=9
, not 35=8
. This required updating _handle_application_message
to place both message types into a given order queue.
The second issue, and the cause of the restarts, was due to timing. The price would shift and exceed band targets, prompting the client side order book to send an order cancellation request. Before this cancellation could be processed, the order would be filled. Given the way we make order cancellations synchronous, this order fill would instead be handled by the waiting cancel_order
method. It was not previously structured to handle filled orders in the circumstance, so it would tell the client (mm-keeper) order book that the cancellation failed, prompting it to retry. This retry would then hang forever (until it exceeded the timer threshold for restarting), due to the first issue where the rejection message wasn't properly queued for that order id
This PR should enable us to process these fills without requiring the keeper to entirely restart by telling the mm-keeper order book that the filled order is now unknown on the side of ErisX, and therefore should be removed (enabling us to avoid sending an unnecessary second cancellation request). The subsequent cancellation failure message would then be appended onto the order queue, but it would no longer read/used by either client or server orderbooks (I figured this cost would be minimal as it would be wiped on any restart, and logic to clean up the fix queues would be potentially expensive and complicated).
All restarts in the logs I saw were tied to these fills occurring immediately after a cancellation request, but immediately before the cancellation response.
Could you provide an explanation of the change? The code this PR removes seems to remove filled orders from the local order book, which would presumably help prevent attempts to cancel them.