pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.
https://pydata.github.io/pandas-datareader/stable/index.html

Response format from Yahoo seems to have changed I keep getting this error. #952

Open yeison opened 1 year ago

yeison commented 1 year ago

File "/Users/yeison/miniforge3/envs/tf-metal-0.6.0/lib/python3.10/site-packages/pandas_datareader/yahoo/daily.py", line 153, in _read_one_data
    data = j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
TypeError: string indices must be integers

DevAleksei commented 1 year ago

As a workaround, you may try the solution from https://pypi.org/project/yfinance/ under the heading "pandas_datareader override".

ivofernandes commented 1 year ago

They started to encrypt the stores. It looks like they are encrypted with AES or something similar, and the stores value now comes back as a string. I'm not sure how to solve that.

raphi6 commented 1 year ago

Is there some way that I can implement a temporary solution using yfinance as an override and pull request it ?

DenisOd91 commented 1 year ago

Is there some way that I can implement a temporary solution using yfinance as an override and pull request it ?

>>> from pandas_datareader import data as pdr
>>> import yfinance as yf
>>> yf.pdr_override()
>>> y_symbols = ['SCHAND.NS', 'TATAPOWER.NS', 'ITC.NS']
>>> from datetime import datetime
>>> startdate = datetime(2022,12,1)
>>> enddate = datetime(2022,12,15)
>>> data = pdr.get_data_yahoo(y_symbols, start=startdate, end=enddate)

Source on StackOverflow

raphi6 commented 1 year ago

Oh yes, I have used this, but I meant fixing pandas-datareader itself. I have already submitted the code for my dissertation, so I can't change my code directly; the only way I can change its behavior is by changing the library, hehe. Sorry for the confusion.

raphi6 commented 1 year ago

They started to encrypt the stores, looks like they encrypted with AES or something like that and now stores values comes as a string, not sure how to solve that

What are the stores? I will try anything to fix this, as I have just submitted my interim dissertation and without this library my code has no data :/ Due to the nature of academia, I don't think asking the marker to override with yfinance will work.

DevAleksei commented 1 year ago

Then you are the best candidate to fix this library. The j["context"]["dispatcher"]["stores"] value in the response now contains encrypted data instead of a plain object, so you may take a look at how this data is handled in yfinance and migrate the code. Good luck.
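To make the failure concrete, here is a minimal sketch of the shape change described above. The function name `get_price_store`, the exception class, and both sample payloads are hypothetical and only for illustration; this is not the pandas-datareader code, and it does not decrypt anything. It only shows why indexing a string with a key raises the TypeError from the first comment, and how a clearer error could be raised instead.

```python
from typing import Any, Mapping


class EncryptedStoresError(TypeError):
    """Raised when the 'stores' payload is an opaque string, not a mapping."""


def get_price_store(page_json: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the HistoricalPriceStore, or fail with a descriptive error."""
    stores = page_json["context"]["dispatcher"]["stores"]
    if isinstance(stores, str):
        # A string here means the payload is no longer a plain object;
        # stores["HistoricalPriceStore"] would raise a bare TypeError.
        raise EncryptedStoresError(
            "'stores' is a string (probably encrypted), not a mapping"
        )
    return stores["HistoricalPriceStore"]


# Hypothetical payloads: the old plain-object shape and the new string shape.
old_style = {"context": {"dispatcher": {"stores": {"HistoricalPriceStore": {"prices": []}}}}}
new_style = {"context": {"dispatcher": {"stores": "opaque-encrypted-blob"}}}

print(get_price_store(old_style))
try:
    get_price_store(new_style)
except EncryptedStoresError as exc:
    print("cannot read stores:", exc)
```

A real fix still has to decrypt the string before parsing it, which is what the code migrated from yfinance is reported to do.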

raphi6 commented 1 year ago

With a stroke of magical programmer luck, I have managed to migrate some code from yfinance so that the data is now decrypted. I have tested it on a small number of stocks and I am able to get their data.

I changed the code in pandas_datareader/yahoo/daily.py, just adding a decryption function and making sure the data is handled correctly, like before.

It took me all day, as I have never worked on such a professional code base before. But that leads to another issue: how do I go about merging my solution? I guess I have to create a pull request now? I am not sure how this works!

pandas-datareader is checked out locally in my IDE and I have made all the changes there. What do you suggest I do? I am sure there are still things that need to be added to the codebase, like requirements? I don't know; this is my first time.

Thanks for any help.

raphi6 commented 1 year ago

@yeison I have found a fix, as described in my comment above. Do you know how I should go about creating a pull request? I just tried from my IDE and got "permission denied". Any suggestions?

mazalkov commented 1 year ago

@raphi6 have you tried branching from the downloaded repository and then pushing your changes? It should say there's no upstream branch, by setting one you'll create a pull request.

raphi6 commented 1 year ago

Ahh yes branching, I always forget. So now I will checkout pandas-datareader, create a branch and add my changes, create a pull request. And that should work?

mazalkov commented 1 year ago

Yes I think so, it may not be the best way of doing it but I do not currently know of a better one. By setting an upstream branch and pushing your branch to it, it should create the PR.

raphi6 commented 1 year ago

[image]

I just tried creating a new branch, committing, pushing, and then creating a PR, and I get the above. What did you mean by setting an upstream branch?

mazalkov commented 1 year ago

You may be trying to push directly to master, which only the repo owners will have permission for (although it is advised never to push directly to master in any case). To not take up replies in the thread, I would recommend looking into the process of opening a PR on GitHub for another repo using online resources:

Link 1 Link 2 Link 3

Apologies I can't be any more help, quite new to Git in terms of contributing to 3rd party repos.

raphi6 commented 1 year ago

I've opened a pull request with working code, be sure to check it out if it works for you guys.

joanlofe commented 1 year ago

Hi, @raphi6 , I can confirm it works perfectly fine. Thanks a lot for the quick fix! Let's see if they approve the pull request.

For reference, I cloned your pull request and installed it using the following sequence of commands (in Ubuntu 20.04):

git clone https://github.com/raphi6/pandas-datareader.git
cd pandas-datareader
git checkout 'Yahoo!_Issue#952'
conda uninstall pandas-datareader
conda install pycryptodome pycryptodomex
python setup.py install --record installed_files.txt

The --record argument in the install command produces a list of installed files, so that it is easy to uninstall in the future (following this SO thread). The pycrypto* packages are dependencies I had to install to make it work.
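For the later uninstall step, a small sketch of how the recorded list could be used, assuming installed_files.txt contains one file path per line (which is what --record writes). The helper name `remove_recorded_files` is hypothetical, and it defaults to a dry run so that nothing is deleted by accident.

```python
from pathlib import Path
from typing import List


def remove_recorded_files(record_path: str, dry_run: bool = True) -> List[Path]:
    """Delete the files listed (one path per line) in a --record file.

    Returns the existing files that were (or, in a dry run, would be) removed.
    """
    handled: List[Path] = []
    for line in Path(record_path).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        target = Path(entry)
        if not target.is_file():
            continue  # already gone, or not a regular file
        if not dry_run:
            target.unlink()
        handled.append(target)
    return handled


# Usage (dry run first, then the real removal):
# print(remove_recorded_files("installed_files.txt"))
# remove_recorded_files("installed_files.txt", dry_run=False)
```

This only removes files; any directories left empty afterwards stay in place.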

pyproper commented 1 year ago

Can this be installed or used on Google Colab?

HiroshiOkada commented 1 year ago

The following seems to work well on Google Colab.

%%shell
git clone https://github.com/raphi6/pandas-datareader.git
cd pandas-datareader 
git checkout 'Yahoo!_Issue#952'
pip install pycryptodome pycryptodomex
python setup.py install --record installed_files.txt

pyproper commented 1 year ago

/usr/local/lib/python3.8/dist-packages/pandas_datareader/base.py:272: SymbolWarning: Failed to read symbol: RemoteDataError: No data fetched using 'YahooDailyReader'

I get the warning above. It started to be an issue a year or so ago, and I have been using this since then:

pip install git+https://github.com/pydata/pandas-datareader.git

Is it possible to put the two together?

nichaelwichterle1 commented 1 year ago

Hello, I've got a few complex scripts running in a Jupyter notebook that pull data from Yahoo Finance. What are the exact commands that need to be entered to regain access, so that I don't get the "string indices must be integers" error? Much appreciated, thanks for the help.

joanlofe commented 1 year ago

@pyproper Yes, it is. You can use this:

 pip install git+https://github.com/raphi6/pandas-datareader.git@ea66d6b981554f9d0262038aef2106dda7138316

Notice I am using the commit hash here instead of a branch name, because it is Yahoo!_Issue#952 and there is an issue with hash characters when using pip this way.
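A likely reason for the hash problem (this is my assumption; pip's own parsing is not shown in this thread) is that everything after '#' in a URL is treated as the fragment, which pip-style VCS URLs use for options such as #egg=. The sketch below uses only the standard library's urlsplit to show where the branch name gets cut; the helper name `split_ref` is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit

REPO = "git+https://github.com/raphi6/pandas-datareader.git"


def split_ref(url: str) -> Tuple[str, str]:
    """Return (ref, fragment) for a VCS URL of the form repo@ref."""
    parts = urlsplit(url)
    # Everything after the last '@' in the path is the Git ref.
    _, _, ref = parts.path.rpartition("@")
    return ref, parts.fragment


by_branch = split_ref(REPO + "@Yahoo!_Issue#952")
by_commit = split_ref(REPO + "@ea66d6b981554f9d0262038aef2106dda7138316")

print(by_branch)  # ('Yahoo!_Issue', '952')
print(by_commit)  # ('ea66d6b981554f9d0262038aef2106dda7138316', '')
```

With the branch name, the ref is truncated to "Yahoo!_Issue" and "952" ends up in the fragment, whereas the commit hash contains no special characters and survives intact.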

nichaelwichterle1 commented 1 year ago

Thanks very much, all looks good now!

pyproper commented 1 year ago

Perfect! Thank You

KryptoEmman commented 1 year ago

@joanlofe

I had to use pip3 to install, but now I get the following error:

Traceback (most recent call last):
  File "USP.py", line 6, in <module>
    from pandas_datareader import data as pdr
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/__init__.py", line 5, in <module>
    from .data import (
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/data.py", line 11, in <module>
    from pandas_datareader.av.forex import AVForexReader
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/av/__init__.py", line 5, in <module>
    from pandas_datareader._utils import RemoteDataError
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/_utils.py", line 6, in <module>
    from pandas_datareader.compat import is_number
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/compat/__init__.py", line 1, in <module>
    from packaging import version
ModuleNotFoundError: No module named 'packaging'

Thoughts?

joanlofe commented 1 year ago

@KryptoEmman Have you tried pip install packaging and then repeat again?

CKDarling commented 1 year ago

I've opened a pull request with working code, be sure to check it out if it works for you guys.

This fix by @raphi6 needs to be made a priority. Yahoo API is bricked through PDR.

raphi6 commented 1 year ago

Thank you! if anyone could get into contact with someone that can accept the PR, that would be great!

KryptoEmman commented 1 year ago

@KryptoEmman Have you tried pip install packaging and then repeat again?

@joanlofe I ran that command with pip3 rather than pip (I am on macOS). It ran successfully, but now I get a different error:

Traceback (most recent call last):
  File "USP.py", line 6, in <module>
    import pandas_datareader as pdr
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/__init__.py", line 5, in <module>
    from .data import (
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/data.py", line 36, in <module>
    from pandas_datareader.yahoo.actions import YahooActionReader, YahooDivReader
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/yahoo/actions.py", line 4, in <module>
    from pandas_datareader.yahoo.daily import YahooDailyReader
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas_datareader/yahoo/daily.py", line 9, in <module>
    from Crypto.Cipher import AES
ModuleNotFoundError: No module named 'Crypto'

The Crypto module is clearly missing, so I ran "pip3 install Crypto", but I still get the same error, so I'm guessing the module must come from another package. I also tried "pip3 install hashlib", which exits with code 1 and the message "unknown OS, please update setup.py", which leads me to think that some of these modules are OS-specific.

pip3 list outputs the following:

aes 1.2.0
certifi 2022.12.7
charset-normalizer 2.1.1
crypto 1.4.1
et-xmlfile 1.1.0
forex-python 1.8
idna 3.4
lxml 4.9.1
Naked 0.1.32
numpy 1.23.4
openpyxl 3.0.10
packaging 22.0
pandas 1.5.1
pandas-datareader 0+untagged.811.gea66d6b
pip 22.3.1
python-dateutil 2.8.2
pytz 2022.6
PyYAML 6.0
requests 2.28.1
setuptools 65.5.0
shellescape 3.8.1
simplejson 3.17.6
six 1.16.0
urllib3 1.26.12

Thoughts? Thanks.

joanlofe commented 1 year ago

@KryptoEmman Try this: pip3 install pycryptodome pycryptodomex

KryptoEmman commented 1 year ago

@joanlofe

pip3 install pycryptodome pycryptodomex

I tried the command above but get the same result when I try to use pandas-datareader with Yahoo: "ModuleNotFoundError: No module named 'Crypto'"

benfulloon commented 1 year ago

Confirming that using yfinance as an override for PDR worked for me as at today -> https://pypi.org/project/pandas-datareader/

from pandas_datareader import data as pdr

import yfinance as yf
yf.pdr_override()  # <== that's all it takes :-)

# download dataframe
data = pdr.get_data_yahoo("SPY", start="2017-01-01", end="2017-04-30")

raphi6 commented 1 year ago

You can also check out the fix in the pull request associated with this bug; it should work as well, without yfinance :)

kingsman1960 commented 1 year ago

@joanlofe

pip3 install pycryptodome pycryptodomex

Tried the command above but get the same result when I try to use pandas-reader with Yahoo: "ModuleNotFoundError: No module named 'Crypto'"

I also saw this problem, so I am now running with Stooq instead of using Yahoo finance.

raphi6 commented 1 year ago

pycryptodome

Check the documentation here for installing the "Crypto" / cryptodome package: https://pypi.org/project/pycryptodome/

ashraf2000shweky commented 1 year ago

pandas-ta works 👍👍

df = pd.DataFrame()
data = df.ta.ticker('aapl',period='1y',interval='1d')

wilare commented 1 year ago

Solved. I was having the same problem. Version 0.2.3 of yfinance was released on 12/20. Uninstall the old version and install the new one.

raphi6 commented 1 year ago

Yeah, but that is a different package. They had the same problem, but their PR has already been accepted. We are just waiting for ours to get accepted.

uad1098 commented 1 year ago

I call Python from my Java trading program and do not know much about Python. This issue and #953 are confusing.

Is #952 about fixing pandas-datareader to work with the new Yahoo response, or only about a workaround with yfinance (which I do not have)? Or is #953 the one that fixes datareader?

My code is very simple and has not worked since 12/19/22, when Yahoo changed something in their response to df = web.DataReader(stock, 'yahoo', start, end). Should I wait until datareader is fixed to work with Yahoo, or is there a workaround I can use other than yfinance? Thanks for any response.

import sys

startDate = sys.argv[1]

endDate=sys.argv[2]

stock=sys.argv[3]

import pandas_datareader as web
import datetime

start = datetime.datetime(2022, 12, 15)  # yy, m, d
end = datetime.datetime(2022, 12, 16)
stock = "gld"  # fix to test with idle

df = web.DataReader(stock, 'yahoo', start, end)
print(df[['High','Low','Close']])

path = 'c:/PythonPrograms/CallPythonApp/'+stock+"Update1.csv"
df.to_csv(path)  # saved in pythonProgramsOutput directory
print()
print(stock,": start date:",start," and end date:",end)

looper15 commented 1 year ago

Can anyone help me out with this issue?

I have used yfinance as an override for PDR, but I am getting this error:

1 F1 Failed download:
- AAPL: No data found for this date range, symbol may be deliste

This is My code

from pandas_datareader import data as pdr
import yfinance as yf
yf.pdr_override()

user_input = st.text_input('Enter Stock Ticker', 'AAPL')

df = pdr.get_data_yahoo(user_input, start="2017-01-01", end="2017-04-30")
display(df)

raphi6 commented 1 year ago

@uad1098 The fix for this issue (#953) is a pandas-datareader fix. If you check out the fix from my repo, it should work like it did before Yahoo! Finance changed their response (they started encrypting it).

raphi6 commented 1 year ago

@looper15 You might be better off asking this on the yfinance GitHub page as well; they will be more familiar with it.

KryptoEmman commented 1 year ago

@joanlofe

pip3 install pycryptodome pycryptodomex

Tried the command above but get the same result when I try to use pandas-reader with Yahoo: "ModuleNotFoundError: No module named 'Crypto'"

FYI: I was finally able to get this working by uninstalling the crypto, pycryptodome and pycryptodomex packages (using pip3 uninstall) and then reinstalling pycryptodome and pycryptodomex with the command above: "pip3 install pycryptodome pycryptodomex".
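A quick way to check which of the two packages is actually importable, as a minimal sketch. The helper name `module_available` is hypothetical. As far as I know, pycryptodome provides the `Crypto` package and pycryptodomex provides `Cryptodome`; my guess (an assumption, not confirmed in this thread) is that the unrelated lowercase `crypto` distribution from the pip3 list above was getting in the way of `Crypto` on a case-insensitive file system.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located without fully importing it.

    Note that find_spec imports parent packages of a dotted name, so a
    missing or broken parent is reported as "not available".
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        return False


# 'Crypto' comes from pycryptodome, 'Cryptodome' from pycryptodomex.
for name in ("Crypto.Cipher.AES", "Cryptodome.Cipher.AES"):
    print(name, "available:", module_available(name))
```

If the first line reports False after installing pycryptodome, uninstalling the conflicting packages and reinstalling, as described above, is the thing to try.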

spot92 commented 1 year ago

Commenting for update notifications.

nmaiorana commented 1 year ago

I tried the above by uninstalling/installing crypto, pycryptodome and pycryptodomex. I'm still getting unintelligible data back from Yahoo.

I'm not sure this is a change to the datareader package. Definitely something with encryption.

raphi6 commented 1 year ago

You have to use the fixed version of pandas-datareader from #953; it adds the decryption function that fixes this issue. We are just waiting on someone to merge it in.

nmaiorana commented 1 year ago

Is it in a branch? If it was I could just pull down the branch and rebuild locally.

joanlofe commented 1 year ago

Take a look at my response here.

nmaiorana commented 1 year ago

That did it!

Thanks.

raphi6 commented 1 year ago

Sometimes I get the following error. If anyone could help out, that would be greatly appreciated :)

"""
Traceback (most recent call last):
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 406, in <module>
    Backtest().range_of_days()
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 392, in range_of_days
    var = VaR(stock_list, temp_start, temp_end, weights, alpha).historical_var() * np.sqrt(t)
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 38, in __init__
    yahoo_data = pandasdr.get_data_yahoo(s, end=end, start=start)['Close']
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
    return YahooDailyReader(*args, **kwargs).read()
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
    df = self._dl_mult_symbols(self.symbols)
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
    stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 238, in _read_one_data
    data = new_j["HistoricalPriceStore"]
UnboundLocalError: local variable 'new_j' referenced before assignment

Process finished with exit code 1
"""

KryptoEmman commented 1 year ago

Yeah, I started to get the same error all of a sudden today. I added a 1-second pause (using the sleep() function) between sequential PDR queries in my script, which worked for most of the day, but then I ran into the same issue again later in the day. A 2-second pause seems to have fixed it once more, but who knows for how long. Playing with PDR's retry_count and pause parameters did not seem to help much either, BTW.

More of a workaround than a solution really, but I hope that helps for now.
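For anyone who wants to try the same thing, here is a rough sketch of the pause idea with a simple retry added. The helper name `fetch_paced` and its parameters are hypothetical (this is not my actual script), and the default values are guesses rather than known limits.

```python
import time
from typing import Any, Callable, Dict, Iterable


def fetch_paced(
    symbols: Iterable[str],
    fetch_one: Callable[[str], Any],
    pause: float = 1.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Fetch symbols one by one, pausing between queries and on failures."""
    results: Dict[str, Any] = {}
    for index, symbol in enumerate(symbols):
        if index:
            sleep(pause)  # pause between sequential queries
        delay = pause
        for attempt in range(1, max_attempts + 1):
            try:
                results[symbol] = fetch_one(symbol)
                break
            except Exception:
                # Broad on purpose: the failure shows up as UnboundLocalError.
                if attempt == max_attempts:
                    raise
                sleep(delay)
                delay *= 2  # wait longer before the next attempt
    return results


# Usage, assuming pdr is pandas_datareader.data:
# data = fetch_paced(
#     ["SPY", "GLD"],
#     lambda s: pdr.get_data_yahoo(s, start="2022-12-01", end="2022-12-15"),
#     pause=2.0,
# )
```

The sleep function is passed in as a parameter only so that the pacing can be tested without actually waiting.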