Create a shared package that can be installed and reused by different projects (local research notebook, different instances of v2realbot, scripts, etc.) to serve as a single point for fetching data and sharing the cache.
Responsibilities of this package:
accessing and managing the local trade cache, with remote fetching when data is missing
accessing and managing the local agg cache, executing aggregation including resampling
support for stocks, later for crypto
Ideas:
trade store (file cache, one day per file); if a day is not present, it is loaded from Alpaca (see the trade-store sketch after this list)
agg data store (db or parquet daily files; start with parquet, as 5M rows of parquet load in ~3 s)
decide the time granularity for the agg file cache; one trading day of 1-second OHLCV cbars is ~700 kB, so optimize for this granularity (it must be fast). Two years of 1s data (~5.5M rows, ~440 trading days) load from parquet in ~3 s; if it were daily files, the overhead of opening 440 files would be immense. Optimize for speed.
support for various aggregation types
if an aggregate is not present, it is built from trades with vectorized aggregation and stored to the cache (see the aggregation sketch after this list)
supports resampling (probably only the highest resolution is stored and coarser granularities are resampled on demand)
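A minimal sketch of the trade store's fetch-or-cache flow, assuming the alpaca-py historical data client; the cache root and file layout (one parquet file per symbol and day) are hypothetical choices here, not the final v2trading structure:

```python
from datetime import date, datetime, time, timedelta
from pathlib import Path

import pandas as pd
from alpaca.data.historical import StockHistoricalDataClient
from alpaca.data.requests import StockTradesRequest

CACHE_ROOT = Path("~/.market_cache/trades").expanduser()  # hypothetical location

def get_trades(client: StockHistoricalDataClient, symbol: str, day: date) -> pd.DataFrame:
    """Return one day of trades, served from the file cache when present."""
    path = CACHE_ROOT / symbol / f"{day.isoformat()}.parquet"
    if path.exists():
        return pd.read_parquet(path)  # cache hit: no remote call
    # Cache miss: fetch the whole day (incl. extended hours) from Alpaca.
    request = StockTradesRequest(
        symbol_or_symbols=symbol,
        start=datetime.combine(day, time.min),
        end=datetime.combine(day + timedelta(days=1), time.min),
    )
    df = client.get_stock_trades(request).df
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_parquet(path)  # persist so every project shares the same cache
    return df
```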
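And a sketch of the vectorized trade-to-bar aggregation plus on-demand resampling, assuming the trades DataFrame carries a DatetimeIndex and price/size columns (the column names are an assumption):

```python
import pandas as pd

def aggregate_ohlcv(trades: pd.DataFrame, freq: str = "1s") -> pd.DataFrame:
    """Build OHLCV bars from raw trades in one vectorized pass."""
    bars = trades["price"].resample(freq).ohlc()
    bars["volume"] = trades["size"].resample(freq).sum()
    return bars.dropna(subset=["open"])  # drop intervals with no trades

def resample_ohlcv(bars: pd.DataFrame, freq: str) -> pd.DataFrame:
    """Resample stored highest-resolution bars to a coarser granularity."""
    agg = {"open": "first", "high": "max", "low": "min",
           "close": "last", "volume": "sum"}
    return bars.resample(freq).agg(agg).dropna(subset=["open"])
```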
Inspiration from this design (originally meant primarily for a database; whether DB will be supported in the first phase is still open, decide during implementation):
Exposed interface:
After installing the package you just configure the stores and access keys and use it within your app, reusing existing stores where available (see the usage sketch below).
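A hypothetical usage sketch; the package, class, and method names (market_data, DataStore, get_agg, get_trades) are illustrative placeholders for whatever interface gets exposed:

```python
from datetime import date

from market_data import DataStore  # hypothetical package and class name

store = DataStore(
    trade_cache="~/.market_cache/trades",  # point at an existing store to reuse it
    agg_cache="~/.market_cache/aggs",
    alpaca_key="...",
    alpaca_secret="...",
)

# One call either serves from cache or fetches/aggregates transparently.
bars = store.get_agg("AAPL", date(2023, 5, 2), date(2023, 5, 5), resolution="1s")
trades = store.get_trades("AAPL", date(2023, 5, 2))
```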
For stocks, daily files always contain extended hours as well; they can be filtered by the API or by the client (see the filtering sketch below).
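A sketch of the client-side filtering, assuming cached frames carry a tz-aware DatetimeIndex; 09:30-16:00 US/Eastern is the regular session:

```python
import pandas as pd

def regular_session_only(bars: pd.DataFrame) -> pd.DataFrame:
    """Drop pre-market and after-hours rows from a cached daily file."""
    eastern = bars.tz_convert("America/New_York")
    return eastern.between_time("09:30", "16:00", inclusive="left")
```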
Try to reuse the v2trading cache structure to avoid rework.
For speed, optimize remote fetching and loading as suggested in this conversation.
Tasks:
Open: DB support