Ben-Sherman / quarterly-earnings-machine-learning-algo

A Commission-Free Algo Trading Bot By Machine Learning Company SEC Filing Language

keep track of the last filing downloaded, and resume from there #6

Closed FlorinAndrei closed 3 years ago

FlorinAndrei commented 4 years ago

Sometimes download_raw_html.py hangs on a download before it finishes the list.

I've added a log of completed downloads. Before downloading anything, the script checks the log (if it exists) and determines the index of the last successful download. It then skips all completed items and resumes with the next one.
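For illustration, here is a minimal sketch of the resume logic described above. It is not the code from this PR: the function names, the one-entry-per-line log format, and the `fetch` callback are all assumptions made for the example.

```python
from pathlib import Path


def last_completed_index(log_path, items):
    """Return the position in `items` of the last entry in the log, or -1.

    Assumes (hypothetically) that the log holds one completed item per line.
    """
    log_file = Path(log_path)
    if not log_file.exists():
        return -1
    completed = [ln.strip() for ln in log_file.read_text().splitlines() if ln.strip()]
    if not completed:
        return -1
    try:
        return items.index(completed[-1])
    except ValueError:
        # The last logged entry is not in the current list, so start over.
        return -1


def download_all(items, log_path, fetch):
    """Call `fetch` on every item not yet logged, logging each success."""
    start = last_completed_index(log_path, items) + 1
    with open(log_path, "a") as log:
        for item in items[start:]:
            fetch(item)  # may raise or hang; the item is logged only on success
            log.write(item + "\n")
            log.flush()  # keep the log current if the process is killed
```

Because an item is written to the log only after `fetch` returns, an interrupted run leaves the log pointing at the last item that fully completed, and the next run picks up right after it.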

If master.tsv is rebuilt between runs, the log must be deleted before the next run, since its entries depend on the contents of master.tsv.
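One possible way to automate that cleanup, which is not part of this PR, is to store a hash of master.tsv next to the log and delete the log whenever the hash changes. The sketch below assumes hypothetical file paths passed in by the caller and requires Python 3.8+ for `missing_ok`.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def discard_stale_log(master_path, log_path, digest_path):
    """Delete the log if master.tsv changed since the digest was recorded.

    Returns True if the log was considered stale and removed.
    All three paths are hypothetical names chosen by the caller.
    """
    current = file_digest(master_path)
    digest_file = Path(digest_path)
    previous = digest_file.read_text().strip() if digest_file.exists() else None
    # With no stored digest there is nothing to compare against; keep the log.
    stale = previous is not None and previous != current
    if stale:
        Path(log_path).unlink(missing_ok=True)
    digest_file.write_text(current)
    return stale
```

Calling `discard_stale_log` at the start of each run, before the log is read, would make the manual deletion unnecessary, at the cost of one extra small file to keep alongside the log.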