pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.
https://pydata.github.io/pandas-datareader/stable/index.html
Other
2.91k stars 680 forks

incorrect volume data from yahoo #945

Open dss010101 opened 1 year ago

dss010101 commented 1 year ago

It seems Yahoo may be returning incorrect volume data.

import pandas_datareader as pdr

# Daily history for the S&P 500 index (^GSPC) from Yahoo, starting 2022-10-05
df = pdr.get_data_yahoo('^GSPC', start='2022-10-05')
df

This returns:

                   High          Low     Open        Close      Volume    Adj Close
Date
2022-10-05  3806.909912  3722.659912  3753.25  3783.280029  4293180000  3783.280029

Volume is shown as 4.29b, but it should be 2.5b according to several sources such as NYSE.
Does anyone know whether this is a known issue with Yahoo or perhaps a new one?
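To put a number on the gap described above, here is a minimal sketch that compares the two figures. The reported value is copied from the output in this issue; the reference value is only the rounded "2.5b" quoted above, so the result is approximate. The function name `relative_difference` is made up for illustration.

```python
# Volume for 2022-10-05 as shown in the output posted in this issue.
REPORTED_VOLUME = 4_293_180_000
# Rounded "2.5b" figure quoted in the issue; an approximation, not an exact value.
REFERENCE_VOLUME = 2_500_000_000


def relative_difference(reported, reference):
    """Return how far `reported` is above `reference`, as a fraction of `reference`."""
    return (reported - reference) / reference


diff = relative_difference(REPORTED_VOLUME, REFERENCE_VOLUME)
print(f"reported volume is {diff:.1%} above the reference")
```

A gap this large is more than a rounding difference, which is why the question is whether the two sources are measuring the same thing.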

datatalking commented 1 year ago

@msingh00 What version of python and what OS are you using?

dss010101 commented 1 year ago

I still see this. At the time of this ticket I was running Win10; I'm now running 11. The Python version is 3.10.5.

I wonder if this is a yahoo issue more than anything else?
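For anyone else reporting the same behaviour, a small sketch like the following collects the environment details asked about above (Python version, OS, package versions). It uses only the standard library; the function name `environment_report` and the choice of packages to list are assumptions made for illustration.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("pandas", "pandas-datareader")):
    """Collect the Python version, OS, and installed versions of the given packages."""
    report = {
        "python": sys.version.split()[0],
        "os": f"{platform.system()} {platform.release()}",
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```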

dss010101 commented 1 year ago

Interesting that the volume for the last day (today) is about half that of the previous days.
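One way to check this observation on a downloaded frame is to compare the last row's volume with the median of the earlier rows. The sketch below uses made-up numbers rather than real Yahoo data, and the helper name `last_row_volume_ratio` is hypothetical.

```python
import pandas as pd


def last_row_volume_ratio(df, column="Volume"):
    """Ratio of the last row's volume to the median volume of all earlier rows."""
    if len(df) < 2:
        raise ValueError("need at least two rows to compare")
    earlier = df[column].iloc[:-1]
    return df[column].iloc[-1] / earlier.median()


# Made-up volumes for illustration only; these are not Yahoo figures.
sample = pd.DataFrame(
    {"Volume": [4_000_000_000, 4_200_000_000, 3_800_000_000, 2_000_000_000]},
    index=pd.to_datetime(["2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06"]),
)

ratio = last_row_volume_ratio(sample)
print(f"last row is {ratio:.2f}x the median of the earlier rows")
```

With a real frame, `last_row_volume_ratio(df)` could be run on the result of the `get_data_yahoo` call from the original post.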

datatalking commented 1 year ago

@msingh00 I'll need a sample of the code section you are running, and to see the headers of that snippet of data; otherwise I'm hunting in the dark.

Need: the stock ticker, stock exchange, etc. that you referenced; a snippet of code; the headers of the stock data mentioned.
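As a rough illustration of what "headers" could look like in a report, the sketch below rebuilds the single row posted in this issue as a DataFrame and prints its column names, dtypes, and first rows. Nothing is downloaded; the values are copied from the output above, and on a real frame the same three `print` calls would apply.

```python
import pandas as pd

# One row copied from the output posted in this issue; nothing is downloaded.
df = pd.DataFrame(
    {
        "High": [3806.909912],
        "Low": [3722.659912],
        "Open": [3753.25],
        "Close": [3783.280029],
        "Volume": [4293180000],
        "Adj Close": [3783.280029],
    },
    index=pd.DatetimeIndex(["2022-10-05"], name="Date"),
)

print(list(df.columns))  # column headers
print(df.dtypes)         # data type of each column
print(df.head())         # first rows of the frame
```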

dss010101 commented 1 year ago

The original ticket above has the code. That's it. That's all I run in a Jupyter notebook, plus some checking of the volume data against other sources. For convenience, in case for some reason you can't scroll up, here it is:


import pandas_datareader as pdr
df = pdr.get_data_yahoo('^GSPC', start = '2022-10-05')
df

datatalking commented 1 year ago

Did you read my reply that the repo specifically says python 3.6 and 3.7?