pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.
https://pydata.github.io/pandas-datareader/stable/index.html

Issues with the data reader fetching yahoo finance #315

Closed Crowbeezy closed 7 years ago

Crowbeezy commented 7 years ago

Apologies, this is my first issue/comment on GitHub. I will review the proper protocol. Please correct me if this is not the right place to post this.


RemoteDataError                           Traceback (most recent call last)

in ()
      4 end = dt.datetime(2017, 5, 8)
      5
----> 6 INPX = data.DataReader(INPX ,'yahoo', start, end)
      7
      8 #Convert Volume from Int to Float

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\data.py in DataReader(name, data_source, start, end, retry_count, pause, session)
     92                 adjust_price=False, chunksize=25,
     93                 retry_count=retry_count, pause=pause,
---> 94                 session=session).read()
     95
     96     elif data_source == "yahoo-actions":

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\yahoo\daily.py in read(self)
     75     def read(self):
     76         """ read one data from specified URL """
---> 77         df = super(YahooDailyReader, self).read()
     78         if self.ret_index:
     79             df['Ret_Index'] = _calc_return_index(df['Adj Close'])

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\base.py in read(self)
    176             df = self._dl_mult_symbols(self.symbols.index)
    177         else:
--> 178             df = self._dl_mult_symbols(self.symbols)
    179         return df
    180

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\base.py in _dl_mult_symbols(self, symbols)
    195         if len(passed) == 0:
    196             msg = "No data fetched using {0!r}"
--> 197             raise RemoteDataError(msg.format(self.__class__.__name__))
    198         try:
    199             if len(stocks) > 0 and len(failed) > 0 and len(passed) > 0:

RemoteDataError: No data fetched using 'YahooDailyReader'

rgkimball commented 7 years ago

Can you provide a sample that replicates your issue? This works for me:

from datetime import datetime
import pandas_datareader.data as data
start = datetime(2016, 12, 31)
end = datetime.now()
INPX = data.DataReader('INPX', 'yahoo', start, end)
Crowbeezy commented 7 years ago

I'm not sure what the problem was. Perhaps using 'end = datetime.today()'?

Your code worked. Thank you for your help!

benpillet commented 7 years ago

From my requirements.txt:

pandas-datareader==0.4.0
pandas==0.20.1

and in the Python shell:

from datetime import datetime
import pandas_datareader.data as data
start = datetime(2016, 12, 31)
end = datetime.now()
INPX = data.DataReader('INPX', 'yahoo', start, end)

with error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/data.py", line 117, in DataReader
    session=session).read()
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/yahoo/daily.py", line 77, in read
    df = super(YahooDailyReader, self).read()
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 157, in read
    params=self._get_params(self.symbols))
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 74, in _read_one_data
    out = self._read_url_as_StringIO(url, params=params)
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 85, in _read_url_as_StringIO
    response = self._get_response(url, params=params)
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 120, in _get_response
    raise RemoteDataError('Unable to read URL: {0}'.format(url))
pandas_datareader._utils.RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv?f=2017&ignore=.csv&b=31&c=2016&g=d&a=11&d=4&s=INPX&e=16

I have a feeling Yahoo changed their endpoint to something else. I get a 502 when I try curl too. The link at https://finance.yahoo.com/quote/SPY/history?p=SPY points to https://query1.finance.yahoo.com/v7/finance/download/SPY?period1=1492372898&period2=1494964898&interval=1d&events=history&crumb=MLOX17FWABw

benpillet commented 7 years ago

Looks like there's also a cookie that needs to be sent in order to avoid a 401 Unauthorized. https://www.elitetrader.com/et/threads/yahoo-historical-data-did-they-change-the-url-recently.309554/

rgkimball commented 7 years ago

Not sure if the icharts failure is a temporary problem, but I submitted a WIP PR (above) to replace the request structure. Even if icharts does come back online, it may be a good idea to implement a backup.

IvanTrendafilov commented 7 years ago

I don't have the time to fix this in the library, but, essentially, there is another API endpoint that one can use. It's https://query1.finance.yahoo.com. But it requires a matching cookie and crumb to use it. I wrote a little PhantomJS script to get it, whilst I was working on it: https://github.com/IvanTrendafilov/YahooFinanceAPITokens

It can be useful to someone who needs to create a URL that they want to query automatically.

You can also get a valid cookie / crumb combination from the Chrome dev tools in the Network tab.

IvanTrendafilov commented 7 years ago

They've confirmed icharts isn't coming back, so @rgkimball's patch should certainly go in.

https://forums.yahoo.net/t5/Yahoo-Finance-help/Is-Yahoo-Finance-API-broken/td-p/250503/page/3

bkcollection commented 7 years ago

@rgkimball @IvanTrendafilov, can the fix be made available through a pip install upgrade, to make it easy for beginners?

Franlodo commented 7 years ago

Yahoo has changed the URL, and the way they use dates. Dates are now in Unix time. For example, to get the historical CSV for AAPL: https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1492510098&period2=1495102098&interval=1d&events=history&crumb=ydacXMYhzrn

period1 and period2 are dates in Unix time: unixtime = (Human time - 25568) * 86400. You must also account for your timezone: for example, I am in Europe at UTC+2, so I must subtract 7200 seconds. My formula is therefore ((Human time - 25568) * 86400) - 7200, where Human time is the date (d/mm/yyyy), 25568 is the number of days from 01/01/1900 to 01/01/1970 (I do this in Excel, where 01/01/1900 is the minimum date), 86400 is the number of seconds in a day, and 7200 is the number of seconds in my 2-hour difference from UTC.
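The same conversion can be sketched in Python, assuming period1 and period2 are seconds since 1970-01-01 UTC. The helper name `to_unix_seconds` and its `utc_offset_hours` parameter are illustrative only and are not part of pandas-datareader or any Yahoo API.

```python
from datetime import date, datetime, timedelta, timezone


def to_unix_seconds(day, utc_offset_hours=0):
    """Return the Unix timestamp of local midnight on `day`.

    `utc_offset_hours` is the local offset from UTC (2 for UTC+2).
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    local_midnight = datetime(day.year, day.month, day.day, tzinfo=tz)
    return int(local_midnight.timestamp())


utc_value = to_unix_seconds(date(2017, 5, 18))       # midnight UTC
local_value = to_unix_seconds(date(2017, 5, 18), 2)  # midnight at UTC+2
print(utc_value, local_value, utc_value - local_value)
```

The difference between the two values is the 7200-second timezone correction described above.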

interval is day, week or month.

events=history is the historical price data; div|split&filter=split is for splits, and div|split&filter=div is for dividends.

crumb is the cookie ... I don't really know how it works, but I have been using the same one since Monday.
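A minimal sketch of assembling such a URL in Python, using only the parameter names and values visible in the example link above. The function name `build_download_url` and the `YOUR_CRUMB` placeholder are made up for illustration; a real crumb has to match your session cookie, and the endpoint is unofficial, so it may change or stop working.

```python
from urllib.parse import urlencode

BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_download_url(ticker, period1, period2, crumb,
                       interval="1d", events="history"):
    """Assemble a download URL from the query parameters described above."""
    query = urlencode({
        "period1": period1,
        "period2": period2,
        "interval": interval,
        "events": events,
        "crumb": crumb,
    })
    return BASE_URL + ticker + "?" + query


# "YOUR_CRUMB" is a placeholder, not a working value.
url = build_download_url("AAPL", 1492510098, 1495102098, "YOUR_CRUMB")
print(url)
```

Using `urlencode` rather than string concatenation also escapes any special characters that a crumb might contain.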

I'm using this to update my data in Excel and it works. I no longer need to wait until morning to get the historical data, because it is available less than an hour after the market closes (I'm talking about American markets).

I apologize for not being fluent in English.

I hope this helps.

bkcollection commented 7 years ago

@Franlodo Have you tried using the new link to download something like 1000 stocks? Will it get blocked? The old API seemed to have no limit, and I am curious whether the new one still allows that. I hope you can try it to confirm.

Franlodo commented 7 years ago

The link fetches the CSV file from the web; it should work for 2 tickers or for 2000. In my Excel file I have nearly 200, and it runs properly.

You can fetch 1000 CSV files and import them with pandas; it works the same whether you save the files or read them "in the air". I posted here because the error reported for pandas-datareader was the URL.

Anyway, I will try and comment.

rgkimball commented 7 years ago

@bkcollection This will be available once the bugs are ironed out and the pull request is merged into the main repository.

bkcollection commented 7 years ago

@rgkimball how optimistic are you that the bugs can be fixed?

rgkimball commented 7 years ago

@bkcollection The latest commits on #331 about wrap it up. It's a little frustrating that the new API drops out periodically, but you can now pull any historical price range, splits and dividends. I haven't found a new interface for Yahoo Options - this may be permanently removed, but I'm happy to implement it if someone finds the endpoint. Same thing for index constituents.

All of the failing tests on my PR are due to Eurostat or Yahoo's API sporadically failing. Notice that which tests fail is inconsistent from run to run - it appears to be random. I'm hoping some other people now find time to pull down the code and give it a thorough review before the maintainers make a decision on merging it in.

bkcollection commented 7 years ago

How do I get the fix into my current module? How does the drop-out happen? Can you explain more?

gusutabopb commented 7 years ago

@bkcollection:

For a temporary fix (until this PR gets merged), try:

$ git clone https://github.com/rgkimball/pandas-datareader
$ cd pandas-datareader
$ git checkout fix-yahoo
$ pip install -e .

In Python:

import pandas_datareader as pdr
print(pdr.__version__)  # Make sure it is '0.4.1'.

I originally wrote this as an answer to this Stackoverflow question

helhadry commented 7 years ago

@gusutabopb The latest version is "0.4.0", isn't it?

gusutabopb commented 7 years ago

@hmz123 The latest on master/pypi is 0.4.0. The PR proposed by @rgkimball makes it 0.4.1. The above instructions are for those who do not want to / can't wait for an official upgrade. See the commits of the fix-yahoo patch branch here: https://github.com/rgkimball/pandas-datareader/commits/fix-yahoo

bkcollection commented 7 years ago

It seems there are still some errors. I'm not sure when a fixed 0.4.1 release will be out. Hopefully this week.

arose13 commented 7 years ago

@bkcollection what do you mean by some errors? Do you mean being blocked by yahoo if you request a handful of symbols?

bkcollection commented 7 years ago

@arose13 Have you been blocked for requesting too many symbols? How many was it?

arose13 commented 7 years ago

@bkcollection Requesting 4-6 symbols in 10 minutes got me blocked for ~12 hours.
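One common way to reduce the chance of being blocked is to space requests out. The sketch below is only an illustration of that idea: the thread does not say what request rate Yahoo tolerates, so the 2-second default is arbitrary, and `fetch_all` and its `fetch` argument are hypothetical names rather than pandas-datareader functions.

```python
import time


def fetch_all(tickers, fetch, min_interval=2.0, sleep=time.sleep):
    """Call fetch(ticker) for each ticker, pausing between calls."""
    results = {}
    for i, ticker in enumerate(tickers):
        if i > 0:
            sleep(min_interval)  # wait before every call after the first
        results[ticker] = fetch(ticker)
    return results


# Demo with a stand-in fetch function and a fake sleep that records pauses.
pauses = []
demo = fetch_all(["AAPL", "SPY", "INPX"], fetch=str.lower, sleep=pauses.append)
print(demo, pauses)
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting.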

bkcollection commented 7 years ago

@arose13 That is terrible. I would like to download around 1000 tickers. Getting blocked after 4 to 6 tickers simply will not work for me.

@Franlodo Are you able to download 200 tickers as you said previously?

Franlodo commented 7 years ago

Now I'm downloading 6750 tickers ... the whole list from the Nasdaq website, in monthly, weekly and daily intervals. I'm not using data-reader until they finish the update, but I do use pandas.

That's the URL I use:

url = ("https://query1.finance.yahoo.com/v7/finance/download/" + tick + "?period1=" + start + "&period2=" + end + "&interval=" + tf + "&events=history&crumb=Put your own crumb")

Here tick is self-explanatory, as are start, end and tf.

I send the request with the POST method, and apply the to_numeric function to my DataFrame.
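The numeric conversion matters because missing values can arrive as text rather than numbers. As a rough standard-library sketch of the same coercion idea (in pandas, `pd.to_numeric(..., errors="coerce")` plays this role), assuming a CSV with a `Date` column followed by numeric columns. The function name `parse_prices` and the sample rows are made up for illustration and are not real quotes.

```python
import csv
import io


def parse_prices(csv_text):
    """Parse CSV text into a list of dicts, coercing numeric columns.

    Values that cannot be parsed as numbers (such as 'null') become None.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {"Date": record["Date"]}
        for column, value in record.items():
            if column == "Date":
                continue
            try:
                row[column] = float(value)
            except (TypeError, ValueError):
                row[column] = None
        rows.append(row)
    return rows


# Made-up sample, not real price data.
sample = "Date,Open,Close\n2017-05-17,153.60,150.25\n2017-05-18,null,null\n"
rows = parse_prices(sample)
print(rows)
```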

Yahoo's data is not as good as it used to be. For example, for AAPL, if you apply the stock split you will see that the close is the same value as the low ... People here are working on a solution, but the problem is Yahoo, which seems to have broken its data.

bkcollection commented 7 years ago

@Franlodo Thanks. It looks like it works as it did before. @rgkimball Is there any schedule for getting your fix into 0.4.1 soon?

cpankaj commented 7 years ago

I was able to fetch data from Yahoo Finance and implemented the fix for Ruby: https://github.com/cpankaj/yquotes

The solution for 401 Unauthorized is to get a fresh cookie and crumb and retry.
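The refresh-and-retry pattern can be sketched in Python as follows. This is a generic illustration, not the code from yquotes or pandas-datareader: `Unauthorized`, `fetch_with_refresh`, `fetch` and `refresh_credentials` are hypothetical names, and how the cookie and crumb are actually obtained is left to the caller.

```python
class Unauthorized(Exception):
    """Raised by the hypothetical fetch callable on an HTTP 401 response."""


def fetch_with_refresh(fetch, refresh_credentials, max_attempts=3):
    """Call fetch(credentials); on Unauthorized, refresh and try again."""
    credentials = refresh_credentials()
    for attempt in range(max_attempts):
        try:
            return fetch(credentials)
        except Unauthorized:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            credentials = refresh_credentials()


# Demo with stand-in callables: the first credentials are rejected.
issued = []


def fake_refresh():
    issued.append("crumb-%d" % len(issued))
    return issued[-1]


def fake_fetch(credentials):
    if credentials == "crumb-0":
        raise Unauthorized()
    return "data fetched with " + credentials


result = fetch_with_refresh(fake_fetch, fake_refresh)
print(result)
```

Capping the number of attempts avoids looping forever if every fresh cookie and crumb pair is rejected.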

bkcollection commented 7 years ago

@rgkimball is there any schedule for when your Yahoo fix will be in 0.4.1?

helhadry commented 7 years ago

Hi, can someone test this? I don't have my computer with me: https://github.com/ranaroussi/fix-yahoo-finance

bkcollection commented 7 years ago

@hmz123 Python 2.7 is not supported. It seems we have to wait for 0.4.1, which has no schedule yet.

deios0 commented 7 years ago

Unfortunately, historical data from http://ichart.finance.yahoo.com/table.csv is no longer working. Finding and keeping crumbs is not straightforward, and since the new endpoint is unofficial, I guess it could change too. That's why I looked for a free or at least cheap alternative and found this one: https://eodhistoricaldata.com/.

It provides the data in plain CSV, in exactly the same format as Yahoo Finance, so it's easy to switch just by replacing URLs.

femtotrader commented 7 years ago

EOD Historical Data wrote a blog post https://eodhistoricaldata.com/knowledgebase/adapt-old-yahoo-scripts-eod-historical-data/ but it's only available for free with the AAPL.US symbol... for other symbols it's a paid API

ehtom commented 7 years ago

Has anyone else noticed that the data quality is significantly worse than before? There seem to be a number of large gaps (filled with 'null') for several tickers. Data is flat out wrong in some cases (e.g. 20% daily moves in the low)...

This seems to also be a problem for the new yahoo finance website (look at BA.L, for example).

Does anyone know of any other high quality free sources of EOD data, or of reliable paid-for data?

femtotrader commented 7 years ago

Here is a small project for downloading data from EOD historical data

jotbe commented 7 years ago

What about Alphavantage as a data provider? I tinkered a bit with their API; it is a bit slow sometimes and I haven't checked the quality of their data yet. They offer plenty of indicators and even (near-)realtime data as JSON. I haven't found out yet who is offering this service.

deios0 commented 7 years ago

@jotbe I tried them 1-2 years ago; the quality was not good and the data was not updated regularly.

femtotrader commented 7 years ago

Here is some code to download data from Alphavantage (only intraday, daily, weekly, monthly data - no technical indicators...)

justinlent commented 7 years ago

+1 for this as a reasonable option/implementation


deios0 commented 7 years ago

@justinlent why not use https://eodhistoricaldata.com/? They have nice data at cheap prices. Do you have any concerns here?

jainraje commented 7 years ago

FYI, I'm using https://github.com/ranaroussi/fix-yahoo-finance. The historical data download appears to work quite well.

To my surprise (or maybe not so much these days) Yahoo Finance did not return any historical data for certain key US tickers. For example: CAT, COP, GRMN and many others. Posting this mostly as FYI and to see if any others are able to download CAT, COP or GRMN historic data.

Perhaps yahoo finance engineers broke a few things (data quality, availability of certain tickers, other?) as they migrated to the new API?

justinlent commented 7 years ago

Agreed, that EOD data works very well too. I meant it more as a free option: Alphavantage looks decent, and is easy and straightforward for someone who wants data to play around with and doesn't want to commit to a recurring monthly subscription.


bkcollection commented 7 years ago

Which countries does Alpha Vantage support?


deios0 commented 7 years ago

@jotbe "They offer plenty of indicators and even (near-)realtime data as JSON. I didn't find out yet, who is offering this service." Btw, do you really need technical indicators from an API? How do you use them?

krinkere commented 7 years ago

At this point, instead of waiting or trying to fix Yahoo, I switched to Quandl... at least for my purposes it does the job.

adminho commented 7 years ago

I followed https://github.com/ranaroussi/fix-yahoo-finance. It works.

deios0 commented 7 years ago

Yes, but they write there: "a temporary fix". I think soon you'll get the same problems with Yahoo.

javadba commented 7 years ago

fyi: the fix from ranaroussi requires python 3.4+

liuyigh commented 7 years ago

@javadba a python 2.7-compatible pull request was merged 5 days ago. FYI.

aisthesis commented 7 years ago

rgkimball's version 0.4.1 is still working for me. Is there a reason why it isn't being merged?

jreback commented 7 years ago

it needs to pass tests and respond to comments

rgkimball commented 7 years ago

I believe comments have been covered - I will try to work on tests next week. Did I miss anything?
