ScottfreeLLC / AlphaPy

Python AutoML for Trading Systems and Sports Betting
Apache License 2.0
1.11k stars 201 forks

Google Daily Data Restrictions #11

Closed mrconway closed 6 years ago

mrconway commented 6 years ago

It now looks like Google daily data history is being limited to one year. We are rapidly running out of options for both free daily and intraday market data. To effectively train market models, we need as much historical data as possible. If anyone has any suggestions for alternate feeds (e.g., Quandl), please post them here.

finetrade commented 6 years ago

http://francescopochetti.com/scrapying-around-web/
http://francescopochetti.com/pythonic-cross-validation-time-series-pandas-scikit-learn/
http://francescopochetti.com/stock-market-prediction-part-introduction/

# Assumed imports: the snippet uses `web` and `parser` without defining them.
import pandas_datareader.data as web  # stands in for the removed pandas.io.data
from dateutil import parser


def getStock(symbol, start, end):
    """
    Downloads Stock from Yahoo Finance.
    Computes daily Returns based on Adj Close.
    Returns pandas dataframe.
    """
    df = web.DataReader(symbol, 'yahoo', start, end)

    # Rename the last column (Adj Close), then suffix every column with the symbol.
    df = df.rename(columns={df.columns[-1]: 'AdjClose'})
    df.columns = df.columns + '_' + symbol
    df['Return_%s' % symbol] = df['AdjClose_%s' % symbol].pct_change()

    return df


def getStockFromQuandl(symbol, name, start, end):
    """
    Downloads Stock from Quandl.
    Computes daily Returns based on Adj Close.
    Returns pandas dataframe.
    """
    # Older releases were imported as `Quandl` and used trim_start, trim_end and authtoken.
    import quandl
    df = quandl.get(symbol, start_date=start, end_date=end, api_key="your token")

    # Same renaming scheme as getStock, but suffixed with a readable name.
    df = df.rename(columns={df.columns[-1]: 'AdjClose'})
    df.columns = df.columns + '_' + name
    df['Return_%s' % name] = df['AdjClose_%s' % name].pct_change()

    return df


def getStockDataFromWeb(fout, start_string, end_string):
    """
    Collects predictors data from Yahoo Finance and Quandl.
    Returns a list of dataframes.
    """
    start = parser.parse(start_string)
    end = parser.parse(end_string)

    nasdaq = getStock('^IXIC', start, end)
    frankfurt = getStock('^GDAXI', start, end)
    london = getStock('^FTSE', start, end)
    paris = getStock('^FCHI', start, end)
    hkong = getStock('^HSI', start, end)
    nikkei = getStock('^N225', start, end)
    australia = getStock('^AXJO', start, end)

    # Quandl takes the date strings directly.
    djia = getStockFromQuandl("YAHOO/INDEX_DJI", 'Djia', start_string, end_string)

    # The output series (the symbol to predict) gets the '_Out' suffix.
    out = web.DataReader(fout, 'yahoo', start, end)
    out = out.rename(columns={out.columns[-1]: 'AdjClose'})
    out.columns = out.columns + '_Out'
    out['Return_Out'] = out['AdjClose_Out'].pct_change()

    return [out, nasdaq, djia, frankfurt, london, paris, hkong, nikkei, australia]

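
To show what these functions hand back without hitting any data feed, here is a minimal offline sketch. It assumes each source returns a frame with an `AdjClose` column. The helper `add_returns` and the prices are made up for illustration and are not part of the code above or of AlphaPy. It applies the same suffix and `pct_change` steps, then lines the frames up by date, which is probably the first thing you would do with the returned list.

```python
import pandas as pd


def add_returns(prices: pd.DataFrame, name: str) -> pd.DataFrame:
    """Hypothetical helper: suffix each column with `name` and add a daily return.

    Mirrors the post-download steps of getStock, but works on any frame
    that already has an 'AdjClose' column.
    """
    out = prices.add_suffix(f"_{name}")
    out[f"Return_{name}"] = out[f"AdjClose_{name}"].pct_change()
    return out


# Synthetic prices (made-up numbers, not market data).
dates = pd.date_range("2017-01-02", periods=4, freq="B")
nasdaq = add_returns(
    pd.DataFrame({"AdjClose": [100.0, 110.0, 99.0, 99.0]}, index=dates), "IXIC"
)
# This feed is one day shorter, as happens when sources have different coverage.
ftse = add_returns(
    pd.DataFrame({"AdjClose": [50.0, 55.0, 55.0]}, index=dates[:3]), "FTSE"
)

# Align the frames on their date index; dates missing from one feed become NaN.
merged = pd.concat([nasdaq, ftse], axis=1, join="outer")

# Keep only the return columns, e.g. as candidate model features.
returns = merged.filter(like="Return_")
print(returns)
```

The first row of every return column is NaN because `pct_change` has no prior price to compare against, so those rows usually need dropping or filling before training.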
mrconway commented 6 years ago

Thank you. We'll test this out and add Quandl as a data source as well.

mrconway commented 6 years ago

The Google Finance URL is acting flaky, even for daily data. Unfortunately, all free sources of financial data are disappearing.

mrconway commented 6 years ago

Please refer to #15.