Open rynkwn opened 8 years ago
Some of LE's numbers, for example DP.DPL for Depreciation.Depletion, don't appear to come from the correct source. In fact, I'm unsure where those numbers are coming from. Checking the adjoining rows doesn't show an obvious off-by-one error assigning LE's values elsewhere, but it's possible the offset has accumulated enough by that point that the "off-by-one" error is actually enormous. If that were the case, though, the last X rows should essentially be lacking data, which isn't true.
Will look into this more deeply tomorrow.
Playing with it now.
So taking a small sample of ~100 companies around LE still attributes incorrect data to LE. However, the get_info function on LE alone does produce correct data.
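A rough sketch of that comparison: run the single-company lookup for every ticker in the sample and report the fields where it disagrees with the batch output. This is written in Python purely for illustration. `find_mismatches` and `fetch_one` are hypothetical names (`fetch_one` stands in for calling get_info on one ticker), and the numbers are made up.

```python
def find_mismatches(batch, fetch_one):
    """Compare batch rows against single-ticker lookups.

    batch: dict mapping ticker -> dict of field -> value (the batch output).
    fetch_one: callable taking a ticker and returning the same kind of dict.
    Returns {ticker: {field: (batch_value, single_value)}} for differing fields.
    """
    mismatches = {}
    for ticker, batch_row in batch.items():
        single_row = fetch_one(ticker)
        diff = {
            field: (batch_row.get(field), single_row.get(field))
            for field in sorted(set(batch_row) | set(single_row))
            if batch_row.get(field) != single_row.get(field)
        }
        if diff:
            mismatches[ticker] = diff
    return mismatches


# Made-up values: the batch row for LE disagrees with the single lookup.
single_lookup = {"LE": {"DP.DPL": 10.0}, "LPI": {"DP.DPL": 20.0}}
batch_output = {"LE": {"DP.DPL": 99.0}, "LPI": {"DP.DPL": 20.0}}

result = find_mismatches(batch_output, lambda ticker: single_lookup[ticker])
print(result)
```

Any ticker that shows up in the result is one where the batch path and the single-company path diverge, which narrows the search to whatever differs between the two.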
If I take the last company in my subset, LPI, I find that the raw financials do indeed produce the correct cash flow statement.
Hm. Perhaps quantmod is grabbing the wrong data?
Scratch that, it seems as if the correct data is being grabbed by get_info. Unsure why I thought otherwise just above. Will check tidyinfo.
Tidyinfo is producing the correct data. Though it's possibly being somewhat harsh in cutting out 2012 data for LE due to the lack of a balance sheet, it doesn't seem pressing.
Will try re-generating our financials data set to see if the inaccuracy persists. Maybe the data set was produced by an older, buggy version?
Correction*
Upon closer inspection, my conclusion is that get_info/tidyinfo is correctly grabbing and producing data. My guess is that, as we only have access to the past 4 10-K filings, we were grabbing weird/old data from LE.
To be completely sure, I've also re-built our financials data set. LE now has reasonable data. On top of that, it's also moved forward around 700 rows. Implicitly this means that there was an enormous gain in information in the preceding companies, which is also concerning.
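One way to see where those extra rows came from is to compare each company's first row position in the old and new data sets. The sketch below is Python for illustration only. It assumes each data set can be reduced to an ordered list of tickers, one entry per row. The helper names and the toy lists are hypothetical.

```python
def first_row_index(tickers):
    """Map each ticker to the index of its first row."""
    first = {}
    for i, ticker in enumerate(tickers):
        first.setdefault(ticker, i)
    return first


def row_shifts(old_tickers, new_tickers):
    """How far each ticker's first row moved between the two data sets."""
    old_first = first_row_index(old_tickers)
    new_first = first_row_index(new_tickers)
    return {
        ticker: new_first[ticker] - old_first[ticker]
        for ticker in old_first
        if ticker in new_first
    }


# Toy row orderings: "A" and "B" each gained one row in the rebuild.
old_rows = ["A", "A", "B", "LE", "LE"]
new_rows = ["A", "A", "A", "B", "B", "LE", "LE"]

shifts = row_shifts(old_rows, new_rows)
print(shifts)
```

The shift grows right after each company that gained rows, so reading the shifts in order points at which preceding companies account for the movement.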
Ex: In our financials data frame:
However, looking directly at the quantmod data, we have:
CASHFLOWS:
BALANCESHEETS:
INCOME STATEMENTS:
In other words, either get_info or tidyinfo is mishandling the information it receives and storing the data incorrectly. Will look into this further.