Open dgunning opened 3 months ago
Copied from Issue 73. For some 10-Q imports, some facts are missing when querying the facts table.
For example, in the latest 10-Q (Q2 2024) for $GD, the filing contains rows for Costs of Products and Services (us-gaap:CostOfGoodsAndServicesSold), but this fact is never loaded into the facts table in edgartools and does not appear in the printed income statement.
The same is true of the "us-gaap:InterestIncomeExpenseNet" and "us-gaap:OtherNonoperatingIncomeExpense" facts.
This may be related to these fields having a role of "http://fasb.org/us-gaap/role/ref/legacyRef", while most of the facts that do get loaded have a role of "http://www.xbrl.org/2003/role/disclosureRef".
Progress so far
Note from https://github.com/emestee Hey,
If this helps, here are the entry points from the FASB taxonomy that group the line items in the mandatory filing statements:
@emestee what do you know about standardized statements vs as-reported statements? Do you know what defines the standard concepts that all companies include in their statements?
Circling back here as I've been playing with the new upgrades. Thanks for this - looks like a lot of work went into the rewrite.
Were you able to pull in the product members as referenced here? https://github.com/dgunning/edgartools/issues/66#issuecomment-2243569825
I have been trying to pull in the concepts that feed into the Revenue Sales, but I still can't figure out how to do that correctly.
This is the code snippet I have (e.g. for the AAPL XBRL instance):
# ----
# Extract detailed revenue items
# ----
# `instance` (the XBRL instance) and `latest_date` are defined earlier
revenue_sources = []
rev_dimensions = instance.dimensions
rev_dimension_value = rev_dimensions['srt:ProductOrServiceAxis']
facts = rev_dimension_value.get_facts()
period_date_str = latest_date.strftime('%Y-%m-%d')
# get the facts for this latest period, using one combined boolean mask
# instead of chaining two separate filters
latest_period_facts = facts[(facts['end_date'] == period_date_str) & (facts['duration'] == '3 months')]
for index, row in latest_period_facts.iterrows():
    # keep only the revenue concepts
    if row.concept.startswith("us-gaap:Revenue"):
        print(row.value, row.concept, row.dimensions)
Running this against AAPL Q2 10Q, I get the following facts:
61564000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'us-gaap:ProductMember'}
24213000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'us-gaap:ServiceMember'}
39296000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:IPhoneMember'}
7009000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:MacMember'}
7162000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:IPadMember'}
8097000000 us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax {'srt:ProductOrServiceAxis': 'aapl:WearablesHomeandAccessoriesMember'}
The problem is that us-gaap:ProductMember actually represents the total of all the individual product lines, so I end up double counting the product values.
How do you determine that the aapl:* members are part of the ProductMember, while the ServiceMember stands on its own without any nesting?
I want this product breakdown to work for any 10-Q so I can present where a company's revenue comes from.
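To make the double counting concrete, here is a minimal sketch in plain Python (not edgartools code). It uses the six values printed above and a hand-written `parent_of` map, which is an assumption based on the nesting described in this comment; a generic solution would have to derive that map from the filing itself. `leaf_facts` is a hypothetical helper name.

```python
# Facts as printed above: (value, member on srt:ProductOrServiceAxis).
facts = [
    (61564000000, "us-gaap:ProductMember"),
    (24213000000, "us-gaap:ServiceMember"),
    (39296000000, "aapl:IPhoneMember"),
    (7009000000, "aapl:MacMember"),
    (7162000000, "aapl:IPadMember"),
    (8097000000, "aapl:WearablesHomeandAccessoriesMember"),
]

# ASSUMPTION: hand-written child -> parent map, following the nesting
# described in the comment above. It is not read from the filing here.
parent_of = {
    "aapl:IPhoneMember": "us-gaap:ProductMember",
    "aapl:MacMember": "us-gaap:ProductMember",
    "aapl:IPadMember": "us-gaap:ProductMember",
    "aapl:WearablesHomeandAccessoriesMember": "us-gaap:ProductMember",
}


def leaf_facts(facts, parent_of):
    """Drop members that are subtotals of other members present in `facts`."""
    present = {member for _, member in facts}
    subtotals = {parent_of[m] for m in present if m in parent_of}
    return [(value, member) for value, member in facts if member not in subtotals]


leaves = leaf_facts(facts, parent_of)
total = sum(value for value, _ in leaves)
print(total)  # 85777000000
```

With the subtotal dropped, the four aapl:* members sum to the ProductMember value (61564000000), and adding the ServiceMember gives the total without counting the products twice.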
You can see the dimensions using the dimensions attribute.
And you can query by dimensions.
@emestee what do you know about standardized statements vs as-reported statements? Do you know what defines the standard concepts that all companies include in their statements?
I actually don't know, I imagine it is SEC regulation derived from federal law and incorporating FASB rules. I also don't think it should matter. SEC filings are validated upon submission and can be assumed to be compliant with technical requirements (otherwise the SEC parser will reject the filing). You should not assign any specific meaning to any items in the statement, other than their relationships to parent items, if any.
I think I want to add a parameter that switches between as_reported (the rows and labels that the company wants to show) and standard (a common set of values and labels that all companies are required to report).
I think Bloomberg operates like that, no?
That's kind of what I'm doing: I use this library to pull in the data but then standardize things into my own fields.
You previously had a version of that in your old income statement code, but the challenge is getting a mapping of all the fields into something.
You can see the dimensions using the dimensions attribute. And you can query by dimensions.
So with this, I am already getting the aapl:* dimensions. But note that 'us-gaap:ProductMember' is also on the srt:ProductOrServiceAxis. This ProductMember happens to be the sum of all the aapl dimensions, while the ServiceMember in this example has no sub-items. Is there a way to know whether a dimension member is a total of other sub-items? (When you pull up the XBRL viewer on the SEC site the items are indented, so I assume the information is somewhere.)
My use case is to pull the product revenue sources generically for all 10-K/10-Q imports, so it is not Apple specific.
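One place this nesting is normally recorded is the filing's definition linkbase, where XBRL Dimensions uses `domain-member` arcs to link a member to its parent. The sketch below reads such arcs with the standard library only. The XML is a made-up toy sample, not Apple's actual linkbase, and `member_parents` is a hypothetical helper, not part of edgartools. Whether a given filer nests its members this way is an assumption to check per filing.

```python
import xml.etree.ElementTree as ET

LINK = "http://www.xbrl.org/2003/linkbase"
XLINK = "http://www.w3.org/1999/xlink"
DOMAIN_MEMBER = "http://xbrl.org/int/dim/arcrole/domain-member"

# Toy definition linkbase (illustrative only, not taken from a real filing).
SAMPLE = """<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:definitionLink xlink:type="extended" xlink:role="http://example.com/role/Revenue">
    <link:loc xlink:type="locator" xlink:label="dom" xlink:href="srt.xsd#srt_ProductsAndServicesDomain"/>
    <link:loc xlink:type="locator" xlink:label="prod" xlink:href="us-gaap.xsd#us-gaap_ProductMember"/>
    <link:loc xlink:type="locator" xlink:label="serv" xlink:href="us-gaap.xsd#us-gaap_ServiceMember"/>
    <link:loc xlink:type="locator" xlink:label="iphone" xlink:href="aapl.xsd#aapl_IPhoneMember"/>
    <link:definitionArc xlink:type="arc" xlink:from="dom" xlink:to="prod"
        xlink:arcrole="http://xbrl.org/int/dim/arcrole/domain-member"/>
    <link:definitionArc xlink:type="arc" xlink:from="dom" xlink:to="serv"
        xlink:arcrole="http://xbrl.org/int/dim/arcrole/domain-member"/>
    <link:definitionArc xlink:type="arc" xlink:from="prod" xlink:to="iphone"
        xlink:arcrole="http://xbrl.org/int/dim/arcrole/domain-member"/>
  </link:definitionLink>
</link:linkbase>"""


def member_parents(xml_text):
    """Return {child concept: parent concept} from domain-member arcs."""
    root = ET.fromstring(xml_text)
    parents = {}
    for dlink in root.iter(f"{{{LINK}}}definitionLink"):
        # Map each locator label to a prefixed concept name, e.g.
        # "us-gaap.xsd#us-gaap_ProductMember" -> "us-gaap:ProductMember".
        concept = {
            loc.get(f"{{{XLINK}}}label"):
                loc.get(f"{{{XLINK}}}href").split("#")[-1].replace("_", ":", 1)
            for loc in dlink.iter(f"{{{LINK}}}loc")
        }
        for arc in dlink.iter(f"{{{LINK}}}definitionArc"):
            if arc.get(f"{{{XLINK}}}arcrole") == DOMAIN_MEMBER:
                child = concept[arc.get(f"{{{XLINK}}}to")]
                parents[child] = concept[arc.get(f"{{{XLINK}}}from")]
    return parents


parents = member_parents(SAMPLE)
```

In this toy sample, a member that appears as a parent of another member (here us-gaap:ProductMember) is a subtotal, while a member with no children (us-gaap:ServiceMember) is a leaf.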
That's kind of what I'm doing: I use this library to pull in the data but then standardize things into my own fields.
You previously had a version of that in your old income statement code, but the challenge is getting a mapping of all the fields into something.
I'm struggling with this standardization now. Would you please share the approach or some code to give me an idea? I'm new to financial data and the SEC but want to standardize financial statements for many companies. @amitgandhinz
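For anyone looking for a starting point, the "standardize into my own fields" idea can be sketched as a lookup table from reported concepts to your own labels. This is only an illustration, not the approach used by anyone in this thread. The table `STANDARD_LABELS`, the function `standardize`, and the sample values are all hypothetical, and a real mapping would need many more concepts.

```python
# ASSUMPTION: a tiny, hand-maintained mapping from us-gaap concepts to
# your own standard field names. A real table would be far larger.
STANDARD_LABELS = {
    "us-gaap:Revenues": "Revenue",
    "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax": "Revenue",
    "us-gaap:CostOfGoodsAndServicesSold": "Cost of Revenue",
    "us-gaap:CostOfRevenue": "Cost of Revenue",
}


def standardize(rows, mapping=STANDARD_LABELS):
    """Split (concept, value) rows into standardized fields and leftovers.

    If two reported concepts map to the same field, the first one wins,
    so alternative tags for the same line item are not added together.
    """
    standardized = {}
    unmapped = []
    for concept, value in rows:
        label = mapping.get(concept)
        if label is None:
            unmapped.append((concept, value))
        else:
            standardized.setdefault(label, value)
    return standardized, unmapped


# Made-up values, for illustration only.
rows = [
    ("us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax", 1000),
    ("us-gaap:CostOfGoodsAndServicesSold", 600),
    ("us-gaap:OtherNonoperatingIncomeExpense", 5),
]
standardized, unmapped = standardize(rows)
```

Returning the unmapped rows separately makes it easy to see which concepts still need an entry in the table, which is the hard part mentioned above.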
The new version is working much better on the financial statements! Thanks a lot for that!
I think there is one small thing that could be improved in the cash flow statement. If I look at Apple, a lot of the lines are positive when they should be negative (like Share repurchases).
@Colem19 @dgunning The statements are great; Dwight has done a great job. You are right though, good eye on identifying this issue. As an example, I pulled ULTA's most recent 10-Q via the TENQ-Class and identified a few more line items that should be negative, should be positive, or are in reverse order (meaning 2024 should be positive and 2023 should be negative). @dgunning do you anticipate that this will be fixed, and is it something that can be fixed?
@Colem19 @dgunning To be clear, I highlighted in yellow what should be negative for both years, and I commented on what should be reversed and what should be positive.
I verified, and most values are in the right direction.
Deferred Income Tax does not match the filing table in the HTML.
The raw value in the XBRL instance file does not match, but I suspect the bug is in the XBRL calculation file.
The calculation files usually have weights of -1 for negative values, but the weight is missing in this case.
There is not much the code can do in this case.
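As a rough illustration of why a missing weight matters, the sketch below multiplies each raw value by its calculation weight. It is not the edgartools implementation. The function `apply_weights`, the `weights` dict, and the numbers are hypothetical. A concept with no weight falls back to +1, which is the situation described above where an outflow stays positive.

```python
def apply_weights(raw_values, weights):
    """Sign raw values by their calculation weights.

    Returns the signed values plus the concepts that had no weight and
    therefore kept their raw sign (default weight of 1.0).
    """
    signed = {}
    missing = []
    for concept, value in raw_values.items():
        if concept not in weights:
            missing.append(concept)
        signed[concept] = value * weights.get(concept, 1.0)
    return signed, missing


# Made-up numbers, for illustration only.
raw_values = {
    "us-gaap:PaymentsForRepurchaseOfCommonStock": 1000,
    "us-gaap:ProceedsFromIssuanceOfCommonStock": 200,
    "us-gaap:DeferredIncomeTaxExpenseBenefit": 50,
}
# ASSUMPTION: weights as they might be read from a calculation file,
# with the deferred tax entry absent to mimic the case above.
weights = {
    "us-gaap:PaymentsForRepurchaseOfCommonStock": -1.0,
    "us-gaap:ProceedsFromIssuanceOfCommonStock": 1.0,
}

signed, missing = apply_weights(raw_values, weights)
```

Listing the concepts without a weight at least makes the gap visible, even though the code cannot know what sign the filer intended.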
Version 3 of XBRL financials