pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.
https://pydata.github.io/pandas-datareader/stable/index.html

Unable to fetch Google finance data #394

Closed nasirali1 closed 6 years ago

nasirali1 commented 6 years ago

Google has changed their URL from "http://www.google.com/finance/info" to "https://finance.google.com/finance"

rsvp commented 6 years ago

OK, please tell us if and where this breaks the API.

What are the error messages with respect to pandas-datareader?

nasirali1 commented 6 years ago

Hi rsvp, thanks for the prompt reply. Please find the Python code and error trace below:

CODE:

def finance_date():
    """
    Stock indices for
    .INX = S&P 500
    .DJI = Dow Jones
    .IXIC = NASDAQ
    """
    nds = web.get_quote_google(['.IXIC', '.DJI', '.INX'])

ERROR TRACE:

Traceback (most recent call last):
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/core/finance/nds_stock.py", line 55, in <module>
    finance_date()
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/core/finance/nds_stock.py", line 40, in finance_date
    nds = web.get_quote_google(['.IXIC', '.DJI', '.INX'])
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/venv/lib/python3.5/site-packages/pandas_datareader/data.py", line 56, in get_quote_google
    return GoogleQuotesReader(*args, **kwargs).read()
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/venv/lib/python3.5/site-packages/pandas_datareader/base.py", line 72, in read
    return self._read_one_data(self.url, self.params)
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/venv/lib/python3.5/site-packages/pandas_datareader/base.py", line 79, in _read_one_data
    out = self._read_url_as_StringIO(url, params=params)
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/venv/lib/python3.5/site-packages/pandas_datareader/base.py", line 90, in _read_url_as_StringIO
    response = self._get_response(url, params=params)
  File "/home/ali/IdeaProjects/Environmental-Data-Collection/venv/lib/python3.5/site-packages/pandas_datareader/base.py", line 139, in _get_response
    raise RemoteDataError('Unable to read URL: {0}'.format(url))
pandas_datareader._utils.RemoteDataError: Unable to read URL: http://www.google.com/finance/info?q=.IXIC%2C.DJI%2C.INX
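
For reference, the query string in the failing URL looks like nothing more than the ticker list joined with commas and percent-encoded. The sketch below reproduces that URL with the standard library only. `build_quote_url` is a hypothetical helper written for illustration and is not part of pandas-datareader; the base URL is copied from the traceback.

```python
from urllib.parse import urlencode

# Endpoint copied from the traceback; per this thread it no longer responds.
OLD_QUOTE_URL = "http://www.google.com/finance/info"


def build_quote_url(symbols, base_url=OLD_QUOTE_URL):
    """Hypothetical helper: comma-join the tickers and URL-encode them as `q`."""
    query = urlencode({"q": ",".join(symbols)})
    return "{0}?{1}".format(base_url, query)


url = build_quote_url([".IXIC", ".DJI", ".INX"])
print(url)  # http://www.google.com/finance/info?q=.IXIC%2C.DJI%2C.INX
```

Pasting the printed URL into a browser or `curl` is a quick way to confirm that the endpoint itself fails, independently of pandas-datareader.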

paintdog commented 6 years ago

Maybe it makes sense to contact Google and ask for a correction; I doubt that this issue affects only a few users.

nasirali1 commented 6 years ago

paintdog, Google has changed/restructured their finance URLs. It's not a mistake that we can ask Google to fix.

paintdog commented 6 years ago

They reduced the available data to a period of about one year. That's a problem they could fix...

nasirali1 commented 6 years ago

paintdog, I think this is a simple URL update in the pandas-datareader API. The URL currently used in the API is "http://www.google.com/finance/info". It should be updated to the new Google Finance URL: "https://finance.google.com/finance".

API code snippet (data.py):

class GoogleQuotesReader(_BaseReader):
    """Get current google quote"""

    @property
    def url(self):
        return 'http://www.google.com/finance/info'

minimalgeek commented 6 years ago

Temporary and very ugly solution for testing purposes:

from datetime import datetime

import pandas_datareader as pdr
from pandas_datareader.google.daily import GoogleDailyReader


# Monkey-patch the reader so it requests the new host directly.
@property
def url(self):
    return 'http://finance.google.com/finance/historical'


GoogleDailyReader.url = url

# Get data
start = datetime(2010, 1, 1)
end = datetime(2014, 1, 1)
ret = pdr.get_data_google(['AAPL'], start, end)
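
The workaround above replaces a class-level property for the rest of the process. A slightly tidier variant, sketched here under stated assumptions, scopes the patch to a `with` block and restores the original afterwards. `FakeDailyReader` is a stand-in class so that the example runs without network access or the Google reader module; `patched_url` is a hypothetical helper, not a pandas-datareader API.

```python
from contextlib import contextmanager


class FakeDailyReader:
    """Stand-in for a reader class; NOT the real GoogleDailyReader."""

    @property
    def url(self):
        return "http://www.google.com/finance/historical"


@contextmanager
def patched_url(reader_cls, new_url):
    """Temporarily replace the class-level `url` property, then restore it."""
    original = reader_cls.url  # class access returns the property object itself
    reader_cls.url = property(lambda self: new_url)
    try:
        yield reader_cls
    finally:
        reader_cls.url = original


with patched_url(FakeDailyReader, "http://finance.google.com/finance/historical"):
    inside = FakeDailyReader().url
outside = FakeDailyReader().url

print(inside)   # http://finance.google.com/finance/historical
print(outside)  # http://www.google.com/finance/historical
```

Limiting the patch to a block keeps a testing-only hack from leaking into unrelated code that imports the same class.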

sof commented 6 years ago

Changing the URL works, but the 302 redirect that www.google.com/finance/historical returns will currently strip the startdate= and enddate= params, which is why people are seeing only the past year's data being returned.

Unintentional bug in the www.google.com redirect rule at play?
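
To make the redirect behaviour concrete, the sketch below compares the query parameters of a requested URL with those of a redirect target. The two URLs are illustrative assumptions: the parameter names `q`, `startdate` and `enddate` come from this thread, but the date values and the exact `Location` value are made up rather than captured from Google's servers. `dropped_params` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def dropped_params(requested_url, redirect_location):
    """Return the query parameter names that are missing after the redirect."""
    before = parse_qs(urlsplit(requested_url).query)
    after = parse_qs(urlsplit(redirect_location).query)
    return sorted(set(before) - set(after))


# Assumed example request, with date values invented for illustration.
requested = (
    "http://www.google.com/finance/historical"
    "?q=AAPL&startdate=Jan+01%2C+2010&enddate=Jan+01%2C+2014"
)
# Assumed redirect target that keeps only the ticker.
location = "http://finance.google.com/finance/historical?q=AAPL"

print(dropped_params(requested, location))  # ['enddate', 'startdate']
```

If the date range is lost this way, the server presumably falls back to its default window, which would match the one year of data people report. Requesting the new host directly avoids the redirect altogether.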

TyrelCB commented 6 years ago

The commit added by @soon fixed the historical data collection; however, the delayed quote is still not functional. I've tried http://finance.google.com/finance with no success.

rsvp commented 6 years ago

@TyrelCB The latest PR by @soon has failed the Travis build: https://github.com/pydata/pandas-datareader/pull/402

rsvp commented 6 years ago

Google Finance is under renovation; portfolios are to be deprecated in mid-November 2017: https://github.com/rsvp/fecon235/issues/7#issuecomment-332572738

coulanuk commented 6 years ago

This fix may be a waste of effort. It looks like there are big holes in the underlying datasets on Google Finance, e.g., Bank of America (BAC) and JP Morgan (JPM) have either no prices or garbage prices (as of the date I am posting).

davidastephens commented 6 years ago

Fixed in #404.