jakevdp / PythonDataScienceHandbook

Python Data Science Handbook: full text in Jupyter Notebooks
http://jakevdp.github.io/PythonDataScienceHandbook
MIT License

Could not access to Google data #125

Open camilogavo opened 6 years ago

camilogavo commented 6 years ago

Hi there,

I am trying to download the Google stock price time series following the Handbook, but something weird happens:

Code I typed:

from pandas_datareader import data
goog=data.DataReader('GOOG',start='2004',end='2016',data_source='google')


RemoteDataError                           Traceback (most recent call last)
in ()
----> 1 goog=data.DataReader('GOOG',start='2004',end='2016',data_source='google')

~/anaconda3/lib/python3.6/site-packages/pandas_datareader/data.py in DataReader(name, data_source, start, end, retry_count, pause, session, access_key)
    135                 chunksize=25,
    136                 retry_count=retry_count, pause=pause,
--> 137                 session=session).read()
    138
    139     elif data_source == "enigma":

~/anaconda3/lib/python3.6/site-packages/pandas_datareader/base.py in read(self)
    179         if isinstance(self.symbols, (compat.string_types, int)):
    180             df = self._read_one_data(self.url,
--> 181                                      params=self._get_params(self.symbols))
    182         # Or multiple symbols, (e.g., ['GOOG', 'AAPL', 'MSFT'])
    183         elif isinstance(self.symbols, DataFrame):

~/anaconda3/lib/python3.6/site-packages/pandas_datareader/base.py in _read_one_data(self, url, params)
     77         """ read one data from specified URL """
     78         if self._format == 'string':
---> 79             out = self._read_url_as_StringIO(url, params=params)
     80         elif self._format == 'json':
     81             out = self._get_response(url, params=params).json()

~/anaconda3/lib/python3.6/site-packages/pandas_datareader/base.py in _read_url_as_StringIO(self, url, params)
     88         Open url (and retry)
     89         """
---> 90         response = self._get_response(url, params=params)
     91         text = self._sanitize_response(response)
     92         out = StringIO()

~/anaconda3/lib/python3.6/site-packages/pandas_datareader/base.py in _get_response(self, url, params, headers)
    137         if params is not None and len(params) > 0:
    138             url = url + "?" + urlencode(params)
--> 139         raise RemoteDataError('Unable to read URL: {0}'.format(url))
    140
    141     def _get_crumb(self, *args):

RemoteDataError: Unable to read URL: http://www.google.com/finance/historical?q=GOOG&startdate=Jan+01%2C+2004&enddate=Jan+01%2C+2016&output=csv

I read the other issues submitted before, and with data_source='yahoo' it works. I'd appreciate your help. Thanks

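Since one source fails here while another works, one option is to try several sources in order. The following is only a minimal sketch: `read_with_fallback` and `fake_reader` are hypothetical names, not part of pandas_datareader. The reader function is passed in as an argument so that `web.DataReader` could be supplied in real use, and the broad `except` is an assumption, since different connectors may raise different error types.

```python
def read_with_fallback(reader, symbol, sources, start, end):
    """Try each data source in order; return (source, data) for the first that works."""
    errors = {}
    for source in sources:
        try:
            return source, reader(symbol, source, start, end)
        except Exception as exc:  # assumption: any failure means "try the next source"
            errors[source] = exc
    raise RuntimeError("All sources failed: {}".format(errors))


def fake_reader(symbol, source, start, end):
    """Stand-in for web.DataReader, used only so this sketch runs offline."""
    if source == "google":
        raise IOError("Unable to read URL")
    return "{} data from {}".format(symbol, source)


source, data = read_with_fallback(
    fake_reader, "GOOG", ["google", "yahoo"], "2004", "2016"
)
print(source, data)

# With the real library this might look like (untested, needs network access):
# import pandas_datareader.data as web
# source, goog = read_with_fallback(web.DataReader, "GOOG", ["google", "yahoo"], "2004", "2016")
```

Returning the source name alongside the data makes it clear which connector actually supplied the result.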
astrojuanlu commented 6 years ago

Alternative: https://support.google.com/docs/answer/3093281

edesz commented 6 years ago

Have a look at the deprecation notice as of Jan 24, 2018.

One of the other connectors should be used instead, e.g. IEX:

import pandas_datareader.data as web
from datetime import datetime

start = datetime(2015, 3, 2)
end = datetime(2018, 1, 31)

df = web.DataReader('GOOG', 'iex', start, end)

df.head()
              open      high     low    close   volume
date                                                  
2015-03-02  560.53  572.1500  558.75  571.340  2123796
2015-03-03  570.45  575.3900  566.52  573.640  1700084
2015-03-04  571.87  577.1100  568.01  573.370  1871694
2015-03-05  575.02  577.9100  573.41  575.330  1385818
2015-03-06  574.88  576.6799  566.76  567.685  1654561

df.tail()
               open     high      low    close   volume
date                                                   
2018-01-25  1172.53  1175.94  1162.76  1170.37  1480540
2018-01-26  1175.08  1175.84  1158.11  1175.84  2018755
2018-01-29  1176.48  1186.89  1171.98  1175.58  1378913
2018-01-30  1167.83  1176.52  1163.52  1163.69  1556346
2018-01-31  1170.57  1173.00  1159.13  1169.94  1538688

Note that there are limitations on how much historical data can be acquired - for IEX, see here.

To produce the above output, I used a system with the following specs:

pandas==0.23.4
pandas-datareader==0.7.0
Python 2.7.15rc1

astrojuanlu commented 5 years ago

(Coming back to this issue every now and then)

To get more complete historical data, you can use =GOOGLEFINANCE("GOOG", "all", DATE(2004,1,1), DATE(2018,12,31), "DAILY") in Google Sheets.

nyck33 commented 1 year ago

@astrojuanlu how's that supposed to help here with Pandas?

nyck33 commented 1 year ago

Try changing 'google' to 'stooq':

import pandas_datareader.data as web
from datetime import datetime

start = datetime(2004, 1, 1)
end = datetime(2016, 12, 31)

print(start, end)
start
end
df = web.DataReader('GOOG', 'stooq', start, end)
goog = df.copy()
goog.head(10)

Note: those were Jupyter cells where I was checking the output, hence the bare start and end lines.
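One thing to check with Stooq results is the row order: the data may come back newest-first, whereas the Handbook's time series examples expect oldest-first. A minimal sketch of normalizing the order, assuming the DataFrame has a date index; `to_chronological` is a hypothetical helper and the sample values are made up so the example runs without network access:

```python
import pandas as pd


def to_chronological(df):
    """Return a copy of df sorted oldest-first by its index."""
    return df.sort_index()


# Stand-in for the DataReader result: made-up prices in newest-first order.
sample = pd.DataFrame(
    {"Close": [3.0, 2.0, 1.0]},
    index=pd.to_datetime(["2016-01-06", "2016-01-05", "2016-01-04"]),
)
sample.index.name = "Date"

goog = to_chronological(sample)
print(goog)
```

With real data the same call would be `goog = to_chronological(df)` in place of `df.copy()`.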

blogscot commented 2 months ago

I was able to load some data using @nyck33's suggestion. However, it was very easy just to go to Yahoo Finance, select the 'MAX' date range, and download the CSV file to my drive.
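A downloaded CSV can then be loaded with pandas directly. This is a minimal sketch under some assumptions: the file has a `Date` column plus price columns (the header below mirrors a layout Yahoo Finance exports have used, but check your own file), `load_price_csv` is a hypothetical helper, and the rows are made-up values held in memory so the example runs without the actual download:

```python
import io

import pandas as pd


def load_price_csv(path_or_buffer, date_column="Date"):
    """Read a price CSV, parse the date column as the index, and sort oldest-first."""
    df = pd.read_csv(path_or_buffer, index_col=date_column, parse_dates=True)
    return df.sort_index()


# Made-up rows standing in for the downloaded file.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2004-08-20,2.0,2.5,1.9,2.4,2.4,200
2004-08-19,1.0,1.5,0.9,1.4,1.4,100
"""

goog = load_price_csv(io.StringIO(csv_text))
print(goog["Close"])
```

For a real file, pass its path instead of the `StringIO` buffer, e.g. `load_price_csv("GOOG.csv")`, where the file name is whatever the download was saved as.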