Closed: griffiri closed this issue 1 year ago
Regarding the first question, I might not have clearly noted in the document, but HftBacktest is a market data replay-based backtesting system, so your order cannot affect any changes to the replay market data. The crucial assumption here is that your order is small enough not to make any market impact. Therefore, your order is not reflected in the market depth.
For example, consider the following market depth:

    ask 2 @ 102
    bid 1 @ 100

If you submit a buy order of 1 @ 100, the market depth remains:

    ask 2 @ 102
    bid 1 @ 100

If you submit a buy order of 1 @ 101, the market depth still remains:

    ask 2 @ 102
    bid 1 @ 100
Your order will not be filled until the opposite best bid or ask crosses it, or a trade occurs at (or through) your order price.
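That fill condition can be sketched roughly as follows. This is a deliberately simplified model to illustrate the idea, not HftBacktest's actual matching logic, and all names here are illustrative:

```python
def check_fill(order_side, order_price, best_bid, best_ask, trade_price=None):
    """Simplified fill check for a resting limit order in a replay backtest.

    The order fills only when the opposite best quote crosses its price,
    or a market trade prints at/through it. (Illustrative sketch only.)
    """
    if order_side == "buy":
        # Opposite best ask drops to or below our bid price.
        if best_ask <= order_price:
            return True
        # A trade prints at or below our bid price.
        if trade_price is not None and trade_price <= order_price:
            return True
    else:  # sell
        # Opposite best bid rises to or above our ask price.
        if best_bid >= order_price:
            return True
        # A trade prints at or above our ask price.
        if trade_price is not None and trade_price >= order_price:
            return True
    return False

# Using the example depth above: ask 2 @ 102, bid 1 @ 100.
print(check_fill("buy", 100, best_bid=100, best_ask=102))  # False: no cross, no trade
print(check_fill("buy", 101, best_bid=100, best_ask=102))  # False: still inside the spread
print(check_fill("buy", 101, 100, 102, trade_price=101))   # True: a trade prints at our price
```

Note how a resting order inside the spread stays unfilled until a trade event actually reaches its price.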
As for the second question, yes, that's the correct format. However, without an incremental book feed and a trade feed, you cannot obtain accurate backtesting results, at least on short timeframes.
If you only have snapshot data, I would recommend coding a simple backtester that fills your orders based on the best bid or ask. This approach would be much faster, and on longer timeframes you can still achieve decent-quality backtesting results.
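A minimal version of such a snapshot-based backtester might look like this. It is an illustrative sketch under assumed inputs (the snapshot tuple layout and order dict fields are my own), with no queue position modeling and no partial fills:

```python
def run_snapshot_backtest(snapshots, orders):
    """Fill resting limit orders against a series of best bid/ask snapshots.

    snapshots: list of (timestamp, best_bid, best_ask)
    orders: list of dicts like {"side": "buy", "price": 100.0, "qty": 1.0}
    Returns fills as (timestamp, side, price, qty).
    (Simplified: no queue position, no partial fills,
    fills happen at the order's own limit price.)
    """
    fills = []
    open_orders = list(orders)
    for ts, best_bid, best_ask in snapshots:
        still_open = []
        for o in open_orders:
            # A buy fills when the best ask crosses down to its price;
            # a sell fills when the best bid crosses up to its price.
            crossed = (o["side"] == "buy" and best_ask <= o["price"]) or \
                      (o["side"] == "sell" and best_bid >= o["price"])
            if crossed:
                fills.append((ts, o["side"], o["price"], o["qty"]))
            else:
                still_open.append(o)
        open_orders = still_open
    return fills

snaps = [(1, 100.0, 102.0), (2, 100.0, 101.0), (3, 101.0, 103.0)]
fills = run_snapshot_backtest(snaps, [{"side": "buy", "price": 101.0, "qty": 1.0}])
print(fills)  # [(2, 'buy', 101.0, 1.0)]
```

Because it only touches the top of book at each snapshot, a loop like this is far cheaper than full depth replay, which is the speed trade-off being described.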
Please see https://github.com/nkaz001/algotrading-example/tree/master/crypto-scratch
I see, that makes sense re the market depth.
I think I'm still not sure about this bit: "Until your order is matched by the opposite best bid or ask or a trade occurs, your order will not be filled." I'm trying to make sure I understand the latter case, where a trade occurs. When I have my sell order in @ 968 and there is a buy trade event which in my data set was recorded @ 970, would that not fill my order (with the trade actually happening at the improved price of 968)? I'm struggling to understand the circumstances under which the 'local' agent could have a limit order filled by an incoming trade event from the exchange.
Unfortunately, the exchange I am working with only provides snapshot order book data, and the project we are working on will involve market making on this exchange, so I was hoping to be as realistic as possible with the simulation. I'm going to take a look at your other repo. Thanks for the response, much appreciated.
I checked again; the first buy trade @ 970 at 1680591208392000 seems to happen after the cancellation request and before the new submission.
In that case, that's the best you can do for now. I'm also thinking of providing a tool to convert a series of snapshots into incremental book messages, plus a queue model that accounts for limited depth, to support exchanges that only provide limited-depth data. But I'm still not sure whether that would yield decent-quality backtesting results.
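The snapshot-to-incremental conversion mentioned above could be sketched roughly like this. This is my own illustration of the idea, not the proposed tool; a real converter would also handle both sides, timestamps, and levels that move in and out of the visible depth window:

```python
def snapshot_diff(prev, curr):
    """Convert two consecutive depth snapshots (one book side) into
    incremental updates.

    prev, curr: dicts mapping price -> qty.
    Returns a sorted list of (price, new_qty) updates, where
    new_qty == 0.0 means the level disappeared. (Rough sketch only.)
    """
    updates = []
    # Levels that changed quantity or newly appeared.
    for price, qty in curr.items():
        if prev.get(price) != qty:
            updates.append((price, qty))
    # Levels that vanished from the visible depth.
    for price in prev:
        if price not in curr:
            updates.append((price, 0.0))
    return sorted(updates)

prev = {100.0: 1.0, 99.0: 2.0}
curr = {100.0: 3.0, 98.0: 1.0}
print(snapshot_diff(prev, curr))  # [(98.0, 1.0), (99.0, 0.0), (100.0, 3.0)]
```

The caveat in the comment is also the source of the quality concern: a level leaving the bottom of a 10-level window looks identical to a cancellation, so the reconstructed incremental feed is inherently ambiguous.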
OK, I missed that when I was debugging; let me check again. Thanks.
Re the distinction between having a series of snapshots vs. incremental updates, I understand that with incremental updates you would have more granular timestamps but is there some other reason why it would be more accurate? Is it because you can determine your position in the queue more accurately?
FYI: I ran my example with the latency set to 0 and can see fills happening when I placed orders inside the spread.
Yes, that's it. So Market-By-Order (aka L3) data is the best for backtesting.
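The queue-position idea behind this can be sketched with a toy model. This is a common textbook simplification, not HftBacktest's actual queue models (which weight these effects more carefully); all names are illustrative:

```python
class QueuePosition:
    """Estimate how much quantity sits ahead of our order at one price
    level, driven by incremental book and trade updates.

    Toy model: trades at our level consume the quantity ahead of us,
    and cancellations are optimistically assumed to come from ahead
    of us too. Real queue models distribute cancels probabilistically.
    """
    def __init__(self, qty_ahead_at_entry):
        # Quantity already resting at the level when we joined it.
        self.qty_ahead = qty_ahead_at_entry

    def on_trade(self, trade_qty):
        # Executed quantity at our level removes queue ahead of us.
        self.qty_ahead = max(0.0, self.qty_ahead - trade_qty)

    def on_cancel(self, cancel_qty):
        # Optimistic assumption: cancels come from ahead of us.
        self.qty_ahead = max(0.0, self.qty_ahead - cancel_qty)

    def filled(self):
        return self.qty_ahead == 0.0

# Join a level with 5.0 ahead of us; 3.0 trades, then 2.0 is cancelled.
q = QueuePosition(5.0)
q.on_trade(3.0)
q.on_cancel(2.0)
print(q.filled())  # True
```

With only periodic snapshots you cannot tell trades, cancels, and new orders apart between observations, which is why incremental (and ideally L3) data gives a much better queue estimate.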
Hi - I tried another experiment based on the simple_two_sided_quote example, where I adjusted the bid-offer spread of the orders I submitted to be 2 ticks wide, with the idea of seeing some fills of my own orders (see the gist below). Then I used the very small data set attached (same as the last issue I raised) and noted that at the time of the first trade event (buy @ 970), I have submitted limit orders buy 966 / sell 968.

My first query is that when I am debugging, I don't see my orders in the market depth object on the exchange side (I checked the timestamps; I don't think this is due to latency). I also don't see my order getting filled, which I think would be the expected behaviour?

Also note that I added depth clear events to my data set, as the data store I am dealing with just has a snapshot of the book to 10 levels at each timestamp; this data set is attached below. Is this the right format? I cross-checked with the Binance example. Thanks.
Archive.zip
https://gist.github.com/griffiri/fd5600763f80a994febdd5eed99109ed