Open garyttierney opened 3 months ago
I conducted a quick test and obtained the following results:
I have some concerns about whether this is the right direction. The approach introduces an additional copy. Moreover, the original daily loading method was designed to handle multiple days of data within limited memory by loading data one by one. With the new suggestion, there is a risk of exceeding memory capacity if data consumption doesn't keep pace with how quickly it is enqueued into the bus.
Without the strategy implementation, the test uses only elapse(100ms). I will include a test with more intensive data, such as BTCUSDT.
> I have some concerns about whether this is the right direction. The approach introduces an additional copy.
That is simply a limitation of the `bus` API in use. Replacing this with a ring-buffer is on the todo list above and gets the readers back to zero-copy and good cache coherence.
> With the new suggestion, there is a risk of exceeding memory capacity if data consumption doesn't keep pace with how quickly it is enqueued into the bus.
The queue is a fixed size, so there's no risk of exceeding memory capacity. It should, however, load incrementally by copying chunks of `Event`s out of the file at a time, which is also on the todo list.
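To illustrate the fixed-size point, here is a minimal sketch that uses the standard library's bounded `sync_channel` as a stand-in for the queue (it is not the `bus` crate and not lock-free). The `Event` struct and `fill_until_full` are hypothetical names made up for this example. Once `cap` items are buffered, `try_send` reports `Full` and a blocking `send` would simply park the loader, so memory use stays bounded.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Hypothetical stand-in for the backtest's `Event` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub ts: i64,
    pub px: f64,
}

/// Pushes up to `n` events into a queue of capacity `cap` while nobody is
/// consuming, and returns how many were accepted before the queue was full.
pub fn fill_until_full(cap: usize, n: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<Event>(cap);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(Event { ts: i as i64, px: 0.0 }) {
            Ok(()) => accepted += 1,
            // Buffer is full: a blocking `send` would wait here instead of
            // allocating more memory.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

With a capacity of 4, only 4 of 10 attempted events are accepted while the consumer is idle.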
> Without the strategy implementation, the test uses only elapse(100ms). I will include a test with more intensive data, such as BTCUSDT.
Can you share this test? It'd be useful to put in a benchmark as I iterate.
Even though the queue implementation is lock-free, doesn't introducing an atomic value to check items in the producer/consumer potentially trigger cache invalidation, adding another layer of overhead?
I used the Rust version of the grid trading backtest example. It would be beneficial to have two benchmarks: one with and one without the strategy implementation. Using the BTCUSDT data provided here would ensure the benchmarks are aligned.
Working on replacing the bus with a ring buffer that eliminates the copies now. I think we can get away with very little atomic usage on x64, references:
Remove the `read_data()` calls within the backtest implementations and replace them with `recv()` calls on a lock-free queue. This avoids the pause that happened previously when a backtest reached the end of the current period's data and began loading the next file. With this model, the data for the next period should be available by the time the previous one finishes.

There are still a couple of improvements that need to be made here:
- `Clone` is unnecessary: readers could easily accept a reference to `Event`, but `BusReader` doesn't give out references
- `DataSource::Data` is currently unsupported because it is not `Send` or `Sync`
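The move from `read_data()` to `recv()` described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's bounded `sync_channel` rather than the lock-free queue in this PR, and `Event`, `load_period`, `spawn_loader` and `run_backtest` are hypothetical names. The loader thread keeps reading the next period while the backtest is still consuming the previous one, so the consumer never has to stop and load a file itself.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for the backtest's `Event` type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Event {
    pub ts: i64,
}

/// Hypothetical loader: stands in for reading one period's file from disk.
fn load_period(day: i64, per_day: i64) -> Vec<Event> {
    (0..per_day).map(|i| Event { ts: day * per_day + i }).collect()
}

/// Spawns a background thread that loads periods in order and feeds them
/// into a bounded queue. It blocks whenever the queue is full.
pub fn spawn_loader(days: i64, per_day: i64, capacity: usize) -> Receiver<Event> {
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for day in 0..days {
            for ev in load_period(day, per_day) {
                // The receiver was dropped, so stop loading.
                if tx.send(ev).is_err() {
                    return;
                }
            }
        }
        // `tx` is dropped here, which ends the consumer's `recv()` loop.
    });
    rx
}

/// Consumer side: the backtest only ever calls `recv()`.
/// Returns the number of events seen and whether timestamps were increasing.
pub fn run_backtest(rx: Receiver<Event>) -> (usize, bool) {
    let mut count = 0;
    let mut last = i64::MIN;
    let mut ordered = true;
    while let Ok(ev) = rx.recv() {
        ordered &= ev.ts > last;
        last = ev.ts;
        count += 1;
    }
    (count, ordered)
}
```

Note that this sketch moves each `Event` through the channel by value, so it does not address the `Clone` or `Send`/`Sync` items listed above.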
There are also a few peculiarities in the implementation, like having `peek` so that `initialize_data` can be trivially implemented. I'd like to see about restructuring this.

Remaining todo items
- Replace `bus` with a simple circular queue
- `NpyReader<R, T>` that yields single `T` items from a reader `R`
- Make `DataSource::Data` work
- `EventConsumer`/`TimestampedEventQueue`/etc. traits that were introduced to reduce implementation effort.
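
For the first todo item, a single-producer/single-consumer circular queue could look roughly like the sketch below. It is an assumption-laden illustration, not the implementation planned for this PR: `spsc_ring`, `Producer` and `Consumer` are hypothetical names, the capacity must be a power of two, and `pop` still moves each item out of its slot rather than handing out a reference. The only synchronization is one acquire load and one release store per operation, and on x86-64 those compile to ordinary loads and stores with no lock-prefixed instructions.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct Ring<T> {
    buf: Box<[UnsafeCell<MaybeUninit<T>>]>,
    head: AtomicUsize, // next slot to read; written only by the consumer
    tail: AtomicUsize, // next slot to write; written only by the producer
}

// Sound because a slot is only ever touched by one side at a time: the
// producer before publishing `tail`, the consumer before publishing `head`.
unsafe impl<T: Send> Sync for Ring<T> {}

pub struct Producer<T> {
    ring: Arc<Ring<T>>,
}

pub struct Consumer<T> {
    ring: Arc<Ring<T>>,
}

/// Creates a fixed-size queue. Items still queued when both ends are dropped
/// are leaked rather than dropped, which is fine for plain-data events.
pub fn spsc_ring<T>(capacity: usize) -> (Producer<T>, Consumer<T>) {
    assert!(capacity.is_power_of_two(), "capacity must be a power of two");
    let buf = (0..capacity)
        .map(|_| UnsafeCell::new(MaybeUninit::uninit()))
        .collect();
    let ring = Arc::new(Ring {
        buf,
        head: AtomicUsize::new(0),
        tail: AtomicUsize::new(0),
    });
    (Producer { ring: ring.clone() }, Consumer { ring })
}

impl<T> Producer<T> {
    /// Enqueues `value`, or hands it back if the queue is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        let r = &*self.ring;
        let tail = r.tail.load(Ordering::Relaxed); // only this side writes it
        let head = r.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == r.buf.len() {
            return Err(value);
        }
        let slot = r.buf[tail & (r.buf.len() - 1)].get();
        unsafe { slot.write(MaybeUninit::new(value)) };
        // Publish the slot only after it has been written.
        r.tail.store(tail.wrapping_add(1), Ordering::Release);
        Ok(())
    }
}

impl<T> Consumer<T> {
    /// Dequeues the oldest item, or returns `None` if the queue is empty.
    pub fn pop(&mut self) -> Option<T> {
        let r = &*self.ring;
        let head = r.head.load(Ordering::Relaxed); // only this side writes it
        let tail = r.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let slot = r.buf[head & (r.buf.len() - 1)].get();
        let value = unsafe { slot.read().assume_init() };
        // Release the slot back to the producer only after it has been read.
        r.head.store(head.wrapping_add(1), Ordering::Release);
        Some(value)
    }
}
```

Because `head` and `tail` each have a single writer, no compare-and-swap loop is needed. Padding the two counters onto separate cache lines would be a natural next step to limit the cache-line bouncing raised earlier in the thread.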