ranaroussi / yfinance

Download market data from Yahoo! Finance's API
https://aroussi.com/post/python-yahoo-finance
Apache License 2.0
13.29k stars, 2.35k forks

Financial values (Income Statement) are different between website and yfinance #1004

Closed prischon closed 1 year ago

prischon commented 2 years ago

Values such as EBIT or Gross Profit are always different. I checked several companies (IBM, Google, Deutsche Wohnen and so on). The values always differ between the website and yfinance.

Part of the code:

import yfinance as yf

IBM = yf.Ticker("IBM")
print(IBM.financials)

Then compare the values on the site (see link) with the output of the Python code.

ValueRaider commented 2 years ago

I did a little digging and the fault lies with Yahoo not yfinance.

Check for yourself - the data returned by API is present in HTML source. Right click -> view page source, and search for e.g. "grossProfit", you'll see it's different to what is being displayed in table.

Can you hunt out an actual earnings report and tell us which is correct?
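The "view page source and search" check above can also be scripted. This is only a minimal sketch: `find_key_values` is a hypothetical helper, the sample string is invented rather than real Yahoo output, and the pattern assumes the value after the key is either a bare number or a flat `{...}` object.

```python
import re

def find_key_values(page_source: str, key: str) -> list:
    """Return the raw text that follows every occurrence of "key": in the source."""
    # Accept either a flat {...} object (no nested braces) or a bare number.
    pattern = re.compile(
        r'"' + re.escape(key) + r'"\s*:\s*(\{[^{}]*\}|-?\d+(?:\.\d+)?)'
    )
    return [match.group(1) for match in pattern.finditer(page_source)]

# Invented sample text standing in for the downloaded page source.
sample = '{"a":{"grossProfit":{"raw":100,"fmt":"100"}},"b":{"grossProfit":250}}'
for hit in find_key_values(sample, "grossProfit"):
    print(hit)
```

If the same key turns up more than once with different numbers, that points at the page carrying more than one data set.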

prischon commented 2 years ago

Meanwhile I checked BeautifulSoup via "yahoofinancials". The library "yahoofinancials" is old and does not work, but the BeautifulSoup call works. Via the BeautifulSoup call I got the correct data for EBIT (the name of the value was "annualEBIT"). Other values were also identical to the values on the site https://finance.yahoo.com/. BeautifulSoup scraped the data from that site.

From my point of view the "yfinance" API works worse than BeautifulSoup. I hope yfinance gets some improvement.

prischon commented 2 years ago

Now I checked the HTML code. The code contains the same values as the site for EBIT in annual format (see the name "annualEBIT" in the picture). I see no problem on the Yahoo site, because the correct values are present.

image

ValueRaider commented 2 years ago

yahoofinancials is returning the same value as yfinance:

from yahoofinancials import YahooFinancials as YF
tick = YF("IBM")
print(tick.get_ebit())

5772000000

Please post your BeautifulSoup code.

prischon commented 2 years ago

I think I wasn't precise enough.

What I did: I set some breakpoints during execution and checked the content of the variables. I did not change any line of code inside yahoofinancials.

Now I have done the same with the yfinance code and I saw the same problem. Please set a breakpoint inside the file "utils.py".

image

Put plainly: at the first stop at the breakpoint (row 109) you will not see annualEBIT in the variable "json_str", but if you keep the program running, it stops a second time at the same position. After the second stop, please check the content of the variable "json_str". You will find the string "annualEBIT".

From my point of view, the algorithm after row 109 no longer fits the text returned by yahoo.com.

ValueRaider commented 2 years ago

I did not change yahoofinancials yet it gives same EBIT value as yfinance.

get_json() is called from different parts of yfinance for different purposes, so breakpoints in it are not helpful.

Be more precise - provide code that reproduces bug, and additional code that gives correct answer (with yahoofinancials?). Because I still see correct behaviour.

prischon commented 2 years ago

a. The current code of yahoofinancials reproduces the bug. (Stock => IBM)

b. Here is my code to get the text with numbers for "annualEBIT", using the company IBM as an example. The code provides a text (see variable "soup") with the correct numbers. Please search inside the HTML-Text for "annualEBIT" inside the variable "soup". I have not yet got as far as extracting the numbers from the HTML text.

BS_example.txt

I hope it helps to start closing the bug.

Additional information: I also checked the values in the "Income Statement" table for IBM on the Yahoo page. The numbers are very similar to those on other financial pages. In parallel I will try to write the code to extract the correct numbers.

ValueRaider commented 2 years ago

Please search inside the HTML-Text for "annualEBIT" inside the variable "soup"

This finds 5992000000 (5.992B) for year ending 2021-12-31.

But the soup also contains "ebit" in "incomeStatementHistory" which states 5772000000 (5.772B) for epoch 1640908800 aka 2021-12-31.

Why do you think "annualEBIT" is correct and "ebit" wrong?
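To make the comparison concrete, here is a small sketch using only the two IBM figures quoted in this comment. The dictionaries are hand-built for illustration and do not mirror Yahoo's actual JSON layout; `epoch_to_date` and `find_mismatches` are hypothetical helper names.

```python
from datetime import datetime, timezone

def epoch_to_date(epoch_seconds: int) -> str:
    """Convert a Unix timestamp (seconds, UTC) to an ISO date string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date().isoformat()

def find_mismatches(first: dict, second: dict) -> list:
    """Return (period, first_value, second_value) for periods where the values differ."""
    return [
        (period, value, second[period])
        for period, value in first.items()
        if period in second and second[period] != value
    ]

# Figures quoted in this thread for IBM, keyed by period end date.
annual_ebit = {"2021-12-31": 5_992_000_000}                 # from "annualEBIT"
history_ebit = {epoch_to_date(1640908800): 5_772_000_000}   # from "ebit"

for period, a, b in find_mismatches(annual_ebit, history_ebit):
    print(f"{period}: annualEBIT={a / 1e9:.3f}B ebit={b / 1e9:.3f}B")
```

Both entries land on the same period once the epoch is converted, so the two numbers really are competing values for the same year.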

prischon commented 2 years ago

Why do you think "annualEBIT" is correct and "ebit" wrong?

  1. 5.992B is identical to the value on the Yahoo site for annual values
  2. 5.992B is similar* to the value on other financial sites, and no site contains 1.640B for 2021-12-31. Strangely, the Yahoo site does not contain 1.640B either.

Additionally, for other values: these are the values that everyone gets from yfinance for IBM (see picture). image

Most values in the picture are quite different from the values on the Yahoo site (for example, the yfinance output gives the same values for "Operating Income" as for "EBIT" link)

*Similar: As far as I know, two different ways of presenting financial/balance data exist in the US. But we use values from Yahoo and would like to have the same values as the site in our calculation algorithms.

ValueRaider commented 2 years ago

Ok, I understand now. The issue is that Yahoo is returning 2 different sets of financial data in different parts of the html json. One is parsed by yfinance, but Yahoo website shows the other.

I discovered someone already proposed a fix to this, see https://github.com/ranaroussi/yfinance/pull/776. You can clone their branch and try it, give feedback.

prischon commented 2 years ago

You can clone their branch and try it, give feedback.

Also with this version of the code from git-shogg the problem is present. His code provides the same values as the main version of yfinance.

I did not analyze his code changes. I just executed his code.

ValueRaider commented 2 years ago

Their code definitely returns different values, I tried it. If you didn't get error "has no attribute 'financials'" then you didn't load their fork; you can also verify with print(yf)

prischon commented 2 years ago

Thanks for the fast feedback! I will check it this evening.

prischon commented 2 years ago

After the check, I am 100% sure that the code from git-shogg provides the same values as the current version of yfinance (main version). Sorry for the negative news.

ValueRaider commented 2 years ago

Then you aren't actually running git-shogg code. How did you verify you loaded their code instead of official?

prischon commented 2 years ago

Then you aren't actually running git-shogg code. How did you verify you loaded their code instead of official?

  1. Replaced all Python files of yfinance manually
  2. Restarted the computer
  3. Then I started the script and checked via the debugger (breakpoints) which file was loaded and also the content of the variables

Did I miss something?

ValueRaider commented 2 years ago

That sounds like it should be git-shogg. Did IBM.financials cause error? It should because git-shogg changed name.

A different way to load is sys.path.insert (examples online). Doesn't interfere with PIP installs. Verify with print(yf)
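A rough sketch of the sys.path.insert approach, for anyone following along. `import_from_clone` is a hypothetical helper and the path in the usage comment is made up; only `sys.path`, `sys.modules` and `importlib` from the standard library are used.

```python
import importlib
import sys

def import_from_clone(clone_dir: str, module_name: str):
    """Import module_name, preferring the copy found under clone_dir."""
    # Position 0 is searched first, so the clone shadows any pip-installed copy.
    sys.path.insert(0, clone_dir)
    # Drop any copy already imported in this process; otherwise the import
    # below would just return the cached module.
    for name in list(sys.modules):
        if name == module_name or name.startswith(module_name + "."):
            del sys.modules[name]
    importlib.invalidate_caches()
    return importlib.import_module(module_name)

# Usage (the path is hypothetical):
# yf = import_from_clone("/home/me/src/yfinance-fork", "yfinance")
# print(yf)           # the repr includes the file the module was loaded from
# print(yf.__file__)
```

If the printed path is not inside the clone directory, the official install is still the one being loaded.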

prischon commented 2 years ago

Did IBM.financials cause error?

Thanks for the hint about "IBM.financials". It took one hour to find: I had two installations of yfinance on my Ubuntu. (Please do not ask me how the F#*#k that happened.)

Now I see the results: the values for EBIT from git-shogg are identical to the table on the Yahoo site.
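Two copies on the import path, as happened here, can be spotted with a short check like the one below. It is a sketch only: `find_installations` is a hypothetical helper that looks for a package directory or a single-file module in each `sys.path` entry, and it does not look inside zip archives or egg links.

```python
import os
import sys

def find_installations(package_name: str, search_paths=None) -> list:
    """List every directory on the search path that holds a copy of package_name."""
    found = []
    for entry in (sys.path if search_paths is None else search_paths):
        base = entry or os.getcwd()  # an empty entry means the current directory
        pkg_dir = os.path.join(base, package_name)
        single_file = os.path.join(base, package_name + ".py")
        if os.path.isdir(pkg_dir) or os.path.isfile(single_file):
            location = os.path.realpath(base)
            if location not in found:
                found.append(location)
    return found

# More than one entry means Python may load a different copy than expected;
# the first entry in the list is the one a plain import would pick up.
print(find_installations("yfinance"))
```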

ValueRaider commented 1 year ago

Correct financials are now available in pre-release version 0.2.0rc4.