OpenBB-finance / OpenBB

Investment Research for Everyone, Everywhere.
https://openbb.co

[FR]: use SEC company facts for equity fundamental commands #6654

Open dijonkitchen opened 2 months ago

dijonkitchen commented 2 months ago

What's the problem of not having this feature?

The SEC filings already have a ton of financial statement data/facts.

These are free for the public, but are not organized according to OpenBB's data model for wide usage.

This would allow users to have a good, default, free alternative to all the other data providers.

Describe the solution you would like

Since SEC company facts are now incorporated (https://docs.openbb.co/platform/reference/equity/compare/company_facts), the equity fundamental commands can be implemented for any company: https://docs.openbb.co/platform/reference/equity/fundamental

There may be some nuances with the naming of things like Revenues vs RevenueFromContractWithCustomerExcludingAssessedTax, but OpenBB can handle that in the transform.

Describe alternatives you've considered

Manually using company facts.

Additional information

N/A

deeleeramone commented 2 months ago

> There may be some nuances with the naming of things like Revenues vs RevenueFromContractWithCustomerExcludingAssessedTax, but OpenBB can handle that in the transform.

This is an understatement, but I can appreciate the sentiment and agree that there is a general need for this type of data access within the open source community.

There are many critical nuances that go along with the financial statement items, and this isn't something you can just "ask ChatGPT" expecting an answer that is factually correct and usable in the real world. I'll highlight some of the complex challenges associated with standardizing raw SEC data for use as a continuous time series that is directly comparable across companies and industries. If you have expertise in any particular area, feel free to jump in and help solve some of the larger problems.

Inputs we need to make this happen:

If anyone would like to help out for the greater good, please indicate in the comments and we can divide-and-conquer.

gtkacz commented 1 month ago

@deeleeramone I'd be willing to help out however I can!

deeleeramone commented 1 month ago

> @deeleeramone I'd be willing to help out however I can!

Awesome! What's the best way to leverage your strengths and areas of expertise?

gtkacz commented 1 month ago

> > @deeleeramone I'd be willing to help out however I can!
>
> Awesome! What's the best way to leverage your strengths and areas of expertise?

Honestly just list out whatever you need for us to get started and I could point out whichever portion I feel most comfortable doing. I have experience in web-scraping and software engineering if that helps.

deeleeramone commented 1 month ago

> Honestly just list out whatever you need for us to get started and I could point out whichever portion I feel most comfortable doing. I have experience in web-scraping and software engineering if that helps.

Scraping the web is not really applicable to what needs to happen here. What we need is:

A hierarchical dictionary of standardized line items, each based on a tag and/or on adding/subtracting several tags, that forms templates for each of the 3 financial statements, mapping single companies across time over the various reporting styles.

This requires a lot of background knowledge specific to US-GAAP accounting, SEC filings, and the XBRL language. Where we're going, there is no "follow these simple steps...", some assembly required.
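As a rough illustration of the "tags, and/or adding/subtracting of several tags" idea, here is a minimal sketch. The `TEMPLATE` layout and the `resolve` helper are hypothetical and not part of OpenBB; the only real names are the US-GAAP tags `GrossProfit`, `Revenues`, and `CostOfRevenue`. It assumes the reported values for one period have already been collected into a flat dict keyed by tag.

```python
from typing import Optional

# Hypothetical template: a standardized line item is either taken directly
# from the first available tag, or derived from a signed sum of other tags.
TEMPLATE = {
    "GrossProfit": {
        "tags": ["GrossProfit"],
        # Fallback when no direct tag is reported: Revenues - CostOfRevenue
        "formula": [("Revenues", 1), ("CostOfRevenue", -1)],
    },
}


def resolve(item: str, reported: dict) -> Optional[float]:
    """Return the value for a standardized line item, or None if unavailable."""
    spec = TEMPLATE[item]
    # 1. Prefer a directly reported tag.
    for tag in spec["tags"]:
        if tag in reported:
            return reported[tag]
    # 2. Otherwise derive it, but only if every term of the formula is present.
    terms = spec.get("formula", [])
    if terms and all(tag in reported for tag, _ in terms):
        return sum(sign * reported[tag] for tag, sign in terms)
    return None


derived = resolve("GrossProfit", {"Revenues": 100.0, "CostOfRevenue": 60.0})
direct = resolve("GrossProfit", {"GrossProfit": 45.0, "Revenues": 100.0})
missing = resolve("GrossProfit", {"Revenues": 100.0})
```

Requiring every term of the formula avoids silently producing a partial sum when a company simply does not report one of the inputs.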

gtkacz commented 1 month ago

The only thing of those I'm familiar with is XBRL, but I'd be willing to learn to help out however I can!

dijonkitchen commented 1 month ago

Perhaps we're still missing a piece of the SEC API that'd be more focused: single company concepts. https://www.sec.gov/search-filings/edgar-application-programming-interfaces#:~:text=and%20across%20time.-,data.sec.gov/api/xbrl/companyconcept/,-The%20company%2Dconcept

This way, people can get all the historical data for one line item within a financial statement. We can then build up whole financial statements from there.

For example, just the Accounts Payable amounts for Alphabet: https://data.sec.gov/api/xbrl/companyconcept/CIK0001652044/us-gaap/AccountsPayableCurrent.json

rather than everything for Alphabet: https://data.sec.gov/api/xbrl/companyfacts/CIK0001652044.json
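For reference, the two URL shapes above follow a fixed pattern: the CIK is zero-padded to 10 digits, followed by the taxonomy and the tag. A small sketch of building them (the helper names are made up for illustration, and no request is sent here; as far as I know, SEC also expects a descriptive User-Agent header on real requests):

```python
BASE = "https://data.sec.gov/api/xbrl"


def company_concept_url(cik, tag, taxonomy="us-gaap"):
    """URL for one concept (one line item) of one company."""
    cik10 = str(int(cik)).zfill(10)  # accepts 1652044 or "0001652044"
    return f"{BASE}/companyconcept/CIK{cik10}/{taxonomy}/{tag}.json"


def company_facts_url(cik):
    """URL for every reported fact of one company."""
    cik10 = str(int(cik)).zfill(10)
    return f"{BASE}/companyfacts/CIK{cik10}.json"


concept_url = company_concept_url(1652044, "AccountsPayableCurrent")
facts_url = company_facts_url("0001652044")
```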

There seem to be only a limited number of company facts, so when there are multiple for one, we can merge them together and use the latest one.
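One possible reading of "merge them together and use the latest one", sketched on made-up sample data: pool the observations (from one tag, or concatenated from several), then keep a single value per reporting period, preferring the most recently filed. The field names `start`, `end`, `val`, and `filed` match what the company-concept JSON returns under `units`; the helper name and the numbers are invented.

```python
def latest_per_period(facts):
    """Keep one fact per (start, end) period, preferring the latest filing."""
    best = {}
    for fact in facts:
        # Balance sheet items have no "start"; duration items do.
        key = (fact.get("start"), fact["end"])
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if key not in best or fact["filed"] > best[key]["filed"]:
            best[key] = fact
    return sorted(best.values(), key=lambda f: f["end"])


# Invented observations: the 2022 year-end value is restated in a later 10-K.
sample = [
    {"end": "2022-12-31", "val": 100, "filed": "2023-02-03", "form": "10-K"},
    {"end": "2022-12-31", "val": 105, "filed": "2024-01-31", "form": "10-K"},
    {"end": "2023-12-31", "val": 120, "filed": "2024-01-31", "form": "10-K"},
]
merged = latest_per_period(sample)
values = [f["val"] for f in merged]
```

Whether the latest filing should always win (restated versus as-originally-reported) is a design decision; this sketch just picks one.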

Thoughts?

dijonkitchen commented 1 month ago

Actually, I do see that company concepts are already used by company facts: https://github.com/OpenBB-finance/OpenBB/blob/develop/openbb_platform/providers/sec/openbb_sec/utils/frames.py#L186

But we'd need it to be able to take in a fiscal_period so that we can get one row of an income statement at a time: https://github.com/OpenBB-finance/OpenBB/blob/develop/openbb_platform/providers/sec/openbb_sec/models/compare_company_facts.py#L161-L170

Made a draft PR in #6685 if y'all want to take a look and/or improve.

deeleeramone commented 1 month ago

> There seem to be only a limited number of company facts, so when there are multiple for one, we can merge them together and use the latest one.

This list is somewhat comprehensive, but will not be complete. It was compiled manually by cross-examining a selection of recent XBRL filings, extracting the GAAP facts, and checking the Frames API for support. I could easily have missed several hundred facts, but I believe I covered the general broad strokes. Enough, at least, to justify providing choices so the user does not have to guess what they might be.

deeleeramone commented 1 month ago

> The only thing of those I'm familiar with is XBRL, but I'd be willing to learn to help out however I can!

Have a look through this - http://www.xbrlsite.com/2015/fro/us-gaap/html/ReportFrames/ - which contains XBRL schemas and mappings for the various types of reporting entity. What we are particularly interested in is how the "Try Order" can be used to build our hierarchical dictionary for mapping each fundamental accounting concept to its XBRL US-GAAP Taxonomy Concept(s).

The potential workflow would look something like this:

This would result in a dictionary that would be sectioned into "balance", "income", "cash", and probably "supplementary".

Within each of these, we would have the fundamental accounting concepts as ordered keys (which would represent an item from the particular statement) and the values would be a list of the US-GAAP Taxonomy Concept(s), ordered in the "Try Order".

With this information, we would be able to reliably structure financial statements directly from the CompanyFacts API output. Getting full year tables would be a first reasonable goal after creating this dynamic mapping workflow.
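To make the "Try Order" lookup concrete, here is a minimal sketch that walks a `try_order` list and returns the first concept the company actually reported. The nesting `facts -> us-gaap -> <Tag> -> units -> USD` follows the CompanyFacts JSON layout; the function name and the sample payload are invented for illustration.

```python
def first_reported(company_facts, try_order, unit="USD"):
    """Return (tag, observations) for the first concept in try_order with data."""
    us_gaap = company_facts.get("facts", {}).get("us-gaap", {})
    for concept in try_order:
        # "us-gaap:StockholdersEquity" -> "StockholdersEquity"
        tag = concept.split(":", 1)[-1]
        observations = us_gaap.get(tag, {}).get("units", {}).get(unit, [])
        if observations:
            return tag, observations
    return None, []


# Invented payload for a partnership that reports PartnersCapital only.
sample_facts = {
    "facts": {
        "us-gaap": {
            "PartnersCapital": {
                "units": {
                    "USD": [
                        {"end": "2023-12-31", "val": 500, "fy": 2023,
                         "fp": "FY", "form": "10-K", "filed": "2024-02-15"}
                    ]
                }
            }
        }
    }
}

equity_try_order = [
    "us-gaap:StockholdersEquity",
    "us-gaap:PartnersCapital",
    "us-gaap:MembersEquity",
]
tag, observations = first_reported(sample_facts, equity_try_order)
no_tag, no_obs = first_reported(sample_facts, ["us-gaap:Liabilities"])
```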

What are your thoughts?

gtkacz commented 1 month ago

@deeleeramone seems perfect for me, will get started on mapping as soon as I have the time! Couple of questions:

deeleeramone commented 1 month ago

@gtkacz:

> How would we store the map of tickers to CIK?

This one already exists, and can be imported:

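from openbb_sec.utils.helpers import symbol_map

# symbol_map is a coroutine: top-level await works in a notebook or IPython;
# in a plain script, use asyncio.run(symbol_map("AAPL")) instead.
cik = await symbol_map("AAPL")
cik
# '0000320193'

> Could you please provide a simple schema of exactly how you want the data to be outputted to?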

This may need some tweaking, and I'm totally open to suggestions, but here's the general idea. "RollUp" indicates that the item is an "Abstract", which means it has children and is displayed as an indented level.

From: http://www.xbrlsite.com/2015/fro/us-gaap/html/ReportFrames/COMID-BSC-CF1-ISM-IEMIB-OILY-SPEC6/index.html

{
    "balance": [
        {
            "line_item": "fac:AssetsRollUp",
            "order": 1,
            "level": 1,
            "children": [
                {
                    "line_item": "fac:CurrentAssets",
                    "order": 1.1,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:AssetsCurrent"],
                    "children": [],
                },
                {
                    "line_item": "fac:NonCurrentAssets",
                    "order": 1.2,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:AssetsNoncurrent"],
                    "children": [],
                },
                {
                    "line_item": "fac:Assets",
                    "order": 1.3,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:Assets", "us-gaap:AssetsCurrent"],
                    "children": [],
                },
            ],
        },
        {
            "line_item": "fac:LiabilitiesEquityRollUp",
            "order": 2,
            "level": 1,
            "children": [
                {
                    "line_item": "fac:LiabilitiesRollUp",
                    "order": 2.1,
                    "level": 2,
                    "children": [
                        {
                            "line_item": "fac:CurrentLiabilities",
                            "order": 2.101,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:LiabilitiesCurrent"],
                            "children": [],
                        },
                        {
                            "line_item": "fac:NoncurrentLiabilities",
                            "order": 2.102,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:LiabilitiesNoncurrent"],
                            "children": [],
                        },
                        {
                            "line_item": "fac:Liabilities",
                            "order": 2.103,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:Liabilities"],
                            "children": [],
                        },
                    ],
                },
                {
                    "line_item": "fac:CommitmentsAndContingencies",
                    "order": 2.2,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": ["us-gaap:CommitmentsAndContingencies"],
                    "children": [],
                },
                {
                    "line_item": "fac:TemporaryEquity",
                    "order": 2.3,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": [
                        "us-gaap:TemporaryEquityCarryingAmountIncludingPortionAttributableToNoncontrollingInterests",
                        "us-gaap:RedeemablePreferredStockCarryingAmount",
                        "us-gaap:TemporaryEquityValueExcludingAdditionalPaidInCapital",
                    ],
                    "children": [],
                },
                {
                    "line_item": "fac:EquityRollUp",
                    "order": 2.4,
                    "level": 2,
                    "children": [
                        {
                            "line_item": "fac:EquityAttributableToParent",
                            "order": 2.401,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:StockholdersEquity",
                                "us-gaap:PartnersCapital",
                                "us-gaap:MembersEquity",
                            ],
                            "children": [],
                        },
                        {
                            "line_item": "fac:EquityAttributableToNoncontrollingInterest",
                            "order": 2.402,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:MinorityInterest",
                                "us-gaap:PartnersCapitalAttributableToNoncontrollingInterest",
                                "us-gaap:MembersEquityAttributableToNoncontrollingInterest",
                            ],
                            "children": [],
                        },
                        {
                            "line_item": "fac:Equity",
                            "order": 2.403,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest",
                                "us-gaap:PartnersCapitalIncludingPortionAttributableToNoncontrollingInterest",
                                "us-gaap:LimitedLiabilityCompanyLlcMembersEquityIncludingPortionAttributableToNoncontrollingInterest",
                            ],
                            "children": [],
                        },
                    ],
                },
                {
                    "line_item": "fac:LiabilitiesAndEquity",
                    "order": 2.5,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": [
                        "us-gaap:LiabilitiesAndStockholdersEquity",
                        "us-gaap:LiabilitiesAndPartnersCapital",
                    ],
                    "children": [],
                },
            ],
        },
    ]
}
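A schema shaped like this can be flattened into display rows with a short recursive walk. The sketch below is only one way to consume it: `flatten` is a hypothetical helper, the sample is a trimmed excerpt of the structure above with the children deliberately out of order, and it assumes (as in the proposal) that "RollUp" abstracts are the entries without a `try_order` key.

```python
def flatten(items, rows=None):
    """Depth-first walk that yields rows in statement order."""
    rows = [] if rows is None else rows
    for item in sorted(items, key=lambda i: i["order"]):
        rows.append(
            {
                "line_item": item["line_item"],
                "level": item["level"],
                # Abstract "RollUp" headers carry no try_order of their own.
                "abstract": "try_order" not in item,
                "try_order": item.get("try_order", []),
            }
        )
        flatten(item.get("children", []), rows)
    return rows


schema = {
    "balance": [
        {
            "line_item": "fac:AssetsRollUp",
            "order": 1,
            "level": 1,
            "children": [
                {"line_item": "fac:Assets", "order": 1.3, "level": 2,
                 "try_order": ["us-gaap:Assets"], "children": []},
                {"line_item": "fac:CurrentAssets", "order": 1.1, "level": 2,
                 "try_order": ["us-gaap:AssetsCurrent"], "children": []},
            ],
        }
    ]
}

rows = flatten(schema["balance"])
for row in rows:
    # Indent by level, mirroring how a rendered statement would look.
    print("  " * (row["level"] - 1) + row["line_item"])
```

Sorting on `order` within each list of siblings means the hierarchy, not the fractional numbering, decides placement across levels.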