Right now, every inference run re-downloads a large amount of data, which takes about a minute. That is far too long. The download is slow because we fetch at quite a high granularity and are limited to 5000 samples per request.
Ideas for a fix (probably best to combine several of these to speed things up):
[ ] (Re)use cached data for retrieval. Do one initial retrieval, then use a stream (or an incremental fetch) for any new datapoints; see the caching sketch below the list.
[ ] Retrieve at a lower granularity -- only retrieve what we need. With a preset granularity and a start date, we can compute the exact start time and receive exactly the required samples. This needs extra care around weekends and hours when trading is closed; see the start-time sketch below.
[ ] Retrieve samples in parallel rather than sequentially per 5000-sample block. This will require a multithreading or multiprocessing library; see the parallel-fetch sketch below.
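
A minimal sketch of the caching idea, assuming a pandas-based pipeline; `fetch_samples(start, end)` is a hypothetical wrapper around the data provider, and the Parquet cache path is made up:

```python
from pathlib import Path

import pandas as pd

CACHE_PATH = Path("samples_cache.parquet")  # assumed local cache location


def fetch_samples(start: pd.Timestamp, end: pd.Timestamp) -> pd.DataFrame:
    """Placeholder for the real data-provider call (hypothetical)."""
    raise NotImplementedError


def load_samples(start: pd.Timestamp, end: pd.Timestamp) -> pd.DataFrame:
    """Serve cached samples from disk and only download the missing tail."""
    if CACHE_PATH.exists():
        cached = pd.read_parquet(CACHE_PATH)
        newest = cached.index.max()
        if newest >= end:
            return cached.loc[start:end]
        # Only datapoints newer than the cached tail are fetched.
        fresh = fetch_samples(newest, end)
        combined = pd.concat([cached, fresh])
        combined = combined[~combined.index.duplicated(keep="last")].sort_index()
    else:
        combined = fetch_samples(start, end)
    combined.to_parquet(CACHE_PATH)  # persist for the next inference run
    return combined.loc[start:end]
```

A long-running process could replace the tail fetch with a streaming subscription, as suggested in the first item.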
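For the exact-start-time item, one way to compute the start is to walk backwards from the end time and count only weekday samples. Holiday calendars and exact session hours are left out here, so the weekday check is only an approximation of "times that trading is possible":

```python
from datetime import datetime, timedelta


def request_start(end: datetime, n_samples: int, granularity: timedelta) -> datetime:
    """Walk backwards from `end` until `n_samples` tradable samples fit."""
    t = end
    remaining = n_samples
    while remaining > 0:
        t -= granularity
        if t.weekday() < 5:  # Mon-Fri only; holidays/session hours not handled
            remaining -= 1
    return t


# Example: start time for 5000 one-minute samples ending Monday at noon.
start = request_start(datetime(2024, 1, 8, 12, 0), 5000, timedelta(minutes=1))
```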
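And a sketch of the parallel retrieval using the standard-library `concurrent.futures`; `fetch_block(offset)` is a stand-in for one paged request of up to 5000 samples (the real API and its paging parameters may differ):

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PER_REQUEST = 5000  # the provider's per-request sample limit


def fetch_block(offset: int) -> list:
    """Placeholder for one paged request (hypothetical API)."""
    raise NotImplementedError


def fetch_all(total_samples: int, max_workers: int = 8) -> list:
    """Fetch all 5000-sample blocks concurrently and reassemble in order."""
    offsets = range(0, total_samples, MAX_PER_REQUEST)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so blocks come back in sequence
        blocks = list(pool.map(fetch_block, offsets))
    return [sample for block in blocks for sample in block]
```

Threads should be enough here since the work is I/O-bound; a process pool would only pay off if parsing became the bottleneck.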