jhirschibar / LittleJohn

Project to discover a deterministic value policy useful in managing a portfolio of short option spreads

Update recent Stock/Options data from Polygon #21

Open jhirschibar opened 1 month ago

jhirschibar commented 1 month ago

Pull the last year of data for stocks and options (the data has not been refreshed since 2023).

Question: what do we do for stocks where there has been a corporate action?

karlie15 commented 1 month ago

DoD

karlie15 commented 4 weeks ago

UPDATE:

jhirschibar commented 3 weeks ago

To convert the quote data into usable price data, we need significant compression. The compression algorithm will work as follows:

The challenge will be to make this run in O(N) time. Will try it with one file and see how big a file it creates.

Or it can be an ETL step for the flat file: e.g., upload the entire CSV to a Postgres table, compress via a query, write the result to a new table, and drop the old table. This would likely be better than doing the work in Python.

jhirschibar commented 3 weeks ago

Note, when repulling data, only make directories/files for options contracts that have results. Lots of empty files cluttering the file system

jhirschibar commented 2 weeks ago

Note, when repulling data, only make directories/files for options contracts that have results. Lots of empty files cluttering the file system

This is complete with issue #31 and PR #40.

Ready to delete tables, clear files, and re-pull data.

jhirschibar commented 2 weeks ago

Need to make sure that the historical prices pull for options doesn't make files for contracts that never trade. The path runners also need to know what to crawl: if no file/directory is made for a contract, they need to know to pass over it.
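
A minimal sketch of the idea, not the repo's actual code: the writer creates a directory and file only when a contract returned results, and the path runner filters its crawl list to directories that exist. The names `save_contract_results`, `contracts_to_crawl`, and the `prices.json` file name are hypothetical, and the one-directory-per-contract layout is an assumption.

```python
import json
from pathlib import Path
from typing import List, Optional


def save_contract_results(base_dir: Path, o_ticker: str, results: list) -> Optional[Path]:
    """Write results for one contract; create nothing when there are no results."""
    if not results:
        return None  # contract never traded: no directory, no file
    contract_dir = base_dir / o_ticker
    contract_dir.mkdir(parents=True, exist_ok=True)
    out_path = contract_dir / "prices.json"  # hypothetical file name
    out_path.write_text(json.dumps(results))
    return out_path


def contracts_to_crawl(base_dir: Path, o_tickers: List[str]) -> List[str]:
    """Return only the contracts that have a directory on disk, in input order."""
    return [t for t in o_tickers if (base_dir / t).is_dir()]
```

With this shape, a path runner never has to special-case empty files: a missing directory simply means "pass over this contract".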

jhirschibar commented 2 weeks ago

This ticket has morphed to include refactoring the quote data so that we can backfill without timeouts. It involves making an abstraction of the aiomultiprocess pool, worker, and scheduler classes customized for the quote process.

jhirschibar commented 2 weeks ago

Kill pill to trigger the TTL if there is a new ticker in the queue for the QuoteWorker. Second kill pill if there are 16 consecutive queries that return no results. If this occurs, set running = False and call queue_cleanup() to remove all remaining tasks that involve that o_ticker. Then let the worker expire.
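
A rough sketch of the second kill pill, under stated assumptions: `EmptyStreakTracker` and `QuoteTask` are hypothetical names, a plain `deque` stands in for the real worker queue, and the `queue_cleanup` signature here is assumed rather than taken from the codebase. Only the threshold of 16 and the `running` flag come from the comment above.

```python
from collections import deque
from typing import NamedTuple

MAX_EMPTY_STREAK = 16  # consecutive empty queries before the worker stops


class QuoteTask(NamedTuple):
    o_ticker: str
    query_args: tuple  # hypothetical payload for one quote query


class EmptyStreakTracker:
    """Counts consecutive empty query results for the current o_ticker."""

    def __init__(self, limit: int = MAX_EMPTY_STREAK):
        self.limit = limit
        self.streak = 0
        self.running = True

    def record(self, results) -> bool:
        """Update the streak; return False once the kill pill has fired."""
        if results:
            self.streak = 0  # any non-empty result resets the count
        else:
            self.streak += 1
            if self.streak >= self.limit:
                self.running = False
        return self.running


def queue_cleanup(queue: deque, o_ticker: str) -> int:
    """Drop all remaining tasks for o_ticker; return how many were removed."""
    kept = [task for task in queue if task.o_ticker != o_ticker]
    removed = len(queue) - len(kept)
    queue.clear()
    queue.extend(kept)
    return removed
```

In use, the worker would call `record()` after each query and, once it returns `False`, call `queue_cleanup()` for the current o_ticker before letting itself expire.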

This will require the o_ticker to be attached to the args in an easily extractable fashion. It will also require the args to be ordered by o_ticker so that they work with the QuoteScheduler.
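
One possible shape for the args, sketched under assumptions: a `NamedTuple` with `o_ticker` as a named field makes extraction trivial, and sorting plus `itertools.groupby` yields contiguous runs per contract. `QuoteArgs`, `start_ns`, and `end_ns` are hypothetical names; how the QuoteScheduler actually consumes the args is not shown in this thread.

```python
from itertools import groupby
from operator import attrgetter
from typing import List, NamedTuple, Tuple


class QuoteArgs(NamedTuple):
    o_ticker: str  # named field, so the ticker is easy to pull off any task
    start_ns: int  # hypothetical query window start
    end_ns: int    # hypothetical query window end


def order_by_o_ticker(args: List[QuoteArgs]) -> List[QuoteArgs]:
    """Sort so all tasks for one o_ticker are contiguous, earliest window first."""
    return sorted(args, key=attrgetter("o_ticker", "start_ns"))


def batches_by_o_ticker(args: List[QuoteArgs]) -> List[Tuple[str, List[QuoteArgs]]]:
    """Group the ordered args into (o_ticker, tasks) runs."""
    ordered = order_by_o_ticker(args)
    return [(t, list(g)) for t, g in groupby(ordered, key=attrgetter("o_ticker"))]
```

Because the tasks for a contract are contiguous, seeing a different `o_ticker` at the head of the queue is a reliable signal that the previous contract is finished.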

Also, update the uploader to only make args for directories/files that exist
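
For the uploader, something along these lines might work; it is a sketch, not the existing uploader. `uploader_args` is a hypothetical helper, the `*.json` pattern and one-directory-per-contract layout are assumptions, and skipping zero-byte files is an extra guard for leftovers from earlier pulls.

```python
from pathlib import Path
from typing import List, Tuple


def uploader_args(base_dir: Path, o_tickers: List[str],
                  pattern: str = "*.json") -> List[Tuple[str, Path]]:
    """Build (o_ticker, file_path) args only for files that exist on disk."""
    args = []
    for o_ticker in o_tickers:
        contract_dir = base_dir / o_ticker
        if not contract_dir.is_dir():
            continue  # no directory was made, so there is nothing to upload
        for path in sorted(contract_dir.glob(pattern)):
            # Skip empty leftovers from earlier pulls as well.
            if path.is_file() and path.stat().st_size > 0:
                args.append((o_ticker, path))
    return args
```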