robertmartin8 / MachineLearningStocks

Using python and scikit-learn to make stock predictions
MIT License

Historical Fundamental Data #10

Closed GatorByte closed 6 years ago

GatorByte commented 6 years ago

Robert, I just discovered your MachineLearningStocks. This is not an issue but a suggestion on fundamental data sources. The American Association of Individual Investors has a product (Stock Investor Pro) with a reasonable subscription fee of US $198/year after a membership fee of $29/year. A subscriber has access to both current data and weekly, non-survivorship-biased historical data back to 2004, covering ~2000 fundamental factors for ~6000 equities. It takes significant effort to download the data and put it into a usable format. I have been using this data source in a personal Python-based stock backtester and screener for my own investing for 14+ years. Interestingly, I too am wading through Kirill Eremenko's Machine Learning and Deep Learning and have just purchased a GPU card with the long-term intent of adding ML stock selection to my current system.

robertmartin8 commented 6 years ago

Hi, thanks for the info! I always appreciate finding out about different data sources, as it's good to keep one's options open. Out of curiosity, may I ask:

My current free data source (found by scraping various sources on the internet) has data back to the year 2000 for about 6000 equities (though only 50 features). It also has lots of missing data and no survivorship correction, so it's definitely inferior to your suggestion, but I guess that's life if you're not paying for data. However, I've found a way to download the numbers from all US annual reports into CSV directly from the official API (free), so at some stage I might just scrape that database and see what I can do.

Hope you find Kirill's course enjoyable: I've found that it's very useful for giving you an overview of what's out there, but it doesn't go into much depth on particular algorithms.