kamilkrukowski / sec-sentiment-pred


`generate_labels.py` verbose output #8

Closed kamilkrukowski closed 1 year ago

kamilkrukowski commented 1 year ago

Currently, whenever label_generation is run for a non-trivial number of inputs, the terminal is flooded with the line

[*********************100%***********************] 1 of 1 completed

repeated hundreds of times. Can this be silenced? Could we get a perfectly normal tqdm loading bar instead?

------EDIT-----------

It really would help to know whether this function will take 15 seconds or 20 minutes. As the dataset size increases, the range of possible runtimes will only widen.

yyitingli commented 1 year ago

This can be silenced by passing progress=False, i.e. yf.download(tikr, progress=False).
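A minimal sketch of how the two requests could fit together: silence the per-ticker bar with `progress=False` and wrap the loop in a single tqdm bar. The helper name `download_prices` and its `download` parameter are hypothetical and not taken from `generate_labels.py`; the fallback for a missing tqdm install is only there so the snippet runs anywhere.

```python
from typing import Callable, Dict, Iterable, Optional

try:
    from tqdm import tqdm
except ImportError:
    # Fall back to a plain loop if tqdm is not installed.
    def tqdm(iterable, **kwargs):
        return iterable


def download_prices(
    tickers: Iterable[str],
    download: Optional[Callable] = None,
) -> Dict[str, object]:
    """Download each ticker quietly, showing one overall progress bar.

    `download` is a hypothetical hook; it defaults to yf.download.
    """
    if download is None:
        import yfinance as yf  # imported lazily so the helper can run offline
        download = yf.download

    prices = {}
    for tikr in tqdm(list(tickers), desc="Downloading prices"):
        # progress=False suppresses the per-call "1 of 1 completed" bar.
        prices[tikr] = download(tikr, progress=False)
    return prices
```

With this shape, the terminal would show one bar over all tickers, which also gives a rough runtime estimate.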

kamilkrukowski commented 1 year ago

Does yf.download actually download new data? I thought the code should re-use TIKRS_Data.pkl?

------------EDIT--------------

Yes of course it downloads new data. But why does it run every time I run the script? @kelseyyew I thought it caches and uses TIKRS_Data.pkl?

kelseyyew commented 1 year ago

The file is actually named 'TIKR_DATA.pickle'. The script should only download the data if 'TIKR_DATA.pickle' does not exist. Once it exists, it will download the pickle file instead of redownloading the data.

kamilkrukowski commented 1 year ago

1. Sorry, what do you mean by "download the local pickle file"?

2. It seems we are at least downloading some ticker called "^GSPC".


Okay, it turns out that the terminal was reprinting the download message dozens of times due to a bug from tqdm in the version of yfinance I had installed. I've updated tqdm and I'm down to 1 download message.

kelseyyew commented 1 year ago

^GSPC is the S&P 500. I can make edits so we only need to download that once. The pickle file only needs to be created once. This is ALL historical price data for all tikrs. Once that is done, then we can use the pickle file to get historical data.
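The create-once, reuse-afterwards behaviour described here could be sketched roughly as follows. This is not the repository's actual code: `load_or_build` and its `build` callback are hypothetical names, and the sketch assumes the downloaded data can be pickled.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_build(cache_path, build: Callable[[], Any]) -> Any:
    """Return cached data if the pickle exists; otherwise build and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: read the local pickle, no network access needed.
        with cache_path.open("rb") as f:
            return pickle.load(f)

    # Cache miss: run the (expensive) download once and save the result.
    data = build()
    with cache_path.open("wb") as f:
        pickle.dump(data, f)
    return data
```

For example, `load_or_build("TIKR_DATA.pickle", lambda: yf.download("^GSPC", progress=False))` would hit the network only on the first run, assuming `yfinance` is imported as `yf`.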