gp1981 / SEC_data_analysis

This repository contains R scripts for retrieving and analyzing company data from the U.S. Securities and Exchange Commission (SEC) filings.
https://gp1981.github.io/SEC_data_analysis/
MIT License

Financial report from SEC data #1

Open gp1981 opened 7 months ago

gp1981 commented 7 months ago
  1. [x] Retrieve Financial Statements based on standardised structure
  2. [ ] Validate Financial Statements based on standardised structure
  3. [ ] Adjust for sign of specific Facts (financial items)
  4. [ ] Calculate key parameters for fundamental analysis

Validate Financial Statements based on standardised structure:

  • Income Statement
  • Cash Flow
  • Balance Sheet

Income Statement: sampling verification with SEC with TTM

gp1981 commented 2 weeks ago

"_Interest Expense12:m" The figures in XBRL must (almost always) be entered as positive numbers, based on their definition. See the Staff Observations from the Review of Interactive Data Financial Statements (from December 13, 2011).

The following financial items must be checked and treated separately in the calculations and visualisation (if they are to be rendered as in the filing).

| Statement | If the definition contains this language, the element can be negative |
| --- | --- |
| Cash Flow | Increase (decrease) |
| Cash Flow | Provided by (used in) |
| Cash Flow | Net |
| Cash Flow | Change in |
| Cash Flow | Proceeds from (payments for) |
| Cash Flow | Proceeds from (payments to) |
| Income Statement | Gain (Loss) |
| Income Statement | Profit (Loss) |
| Income Statement | Income (expense) |
| Income Statement | Per share |
| Statement of Stockholders Equity | Equity |
| Statement of Stockholders Equity | Retained Earnings |

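The sign rules above could be turned into a simple check that flags negative values on elements whose definition does not contain any of the listed phrases. This is only a minimal sketch: the helper `can_be_negative()`, the `facts` data frame and its columns `definition` and `val` are hypothetical names, the numbers are made up, and the plain substring matching is a simplification (for example, "Net" would also match inside a longer word).

```r
# Phrases taken from the list above; if a definition contains one of them,
# a negative value is allowed.
negative_allowed_phrases <- c(
  "Increase (decrease)", "Provided by (used in)", "Net", "Change in",
  "Proceeds from (payments for)", "Proceeds from (payments to)",
  "Gain (Loss)", "Profit (Loss)", "Income (expense)", "Per share",
  "Equity", "Retained Earnings"
)

# Hypothetical helper: TRUE when a definition contains any of the phrases.
# Matching is case-insensitive and literal (no regular expressions).
can_be_negative <- function(definition) {
  phrases <- tolower(negative_allowed_phrases)
  vapply(tolower(definition), function(d) {
    any(vapply(phrases, function(p) grepl(p, d, fixed = TRUE), logical(1)))
  }, logical(1), USE.NAMES = FALSE)
}

# Toy data; definitions and values are invented for illustration only.
facts <- data.frame(
  definition = c(
    "Interest expense",
    "Increase (decrease) in accounts receivable",
    "Gain (loss) on sale of assets"
  ),
  val = c(-120, -45, -10)
)

# Flag negative values that the definition does not allow for.
facts$unexpected_negative <- facts$val < 0 & !can_be_negative(facts$definition)
print(facts)
```

Rows flagged in `unexpected_negative` would then be candidates for manual review or a sign adjustment before they enter the calculations.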
gp1981 commented 2 weeks ago

5c8c4af23eb52ae869571f758288036ee2a677ba

gp1981 commented 2 weeks ago

The calculation of the following standardized_label is to be fixed (ref. End = 2021-12-31) for Cash Flow TTM

gp1981 commented 1 week ago

Problem identified: the correct val is removed when filtering out duplicated val. In line 846, rows with a duplicated standardized_label are filtered, keeping only the "first" occurrence.

Suggested action: reconsider removing duplicates only "after" the sum of val for the same standardized_label for the same fiscal period.
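One possible reading of the suggested action, sketched with dplyr: first sum val per standardized_label and fiscal period within each filing, then keep only the most recently filed total. The column names follow the snippet from the script (`standardized_label`, `year_end`, `quarter_end`, `filed`, `val`); the data, and the assumption that a filing never repeats the same fact twice, are made up for this illustration and may not match the actual script.

```r
library(dplyr)

# Toy data: label "A" has two distinct facts with the same val (100) in each
# filing, and the period is reported again in a later filing.
df_std_CF <- data.frame(
  standardized_label = c("A", "A", "A", "A", "B"),
  year_end = 2021,
  quarter_end = 4,
  filed = as.Date(c("2022-02-01", "2022-02-01",
                    "2023-02-01", "2023-02-01", "2022-02-01")),
  val = c(100, 100, 100, 100, 50)
)

# Step 1: sum val per label and fiscal period within each filing, so that
# equal values belonging to different facts are not dropped as duplicates.
df_summed <- df_std_CF %>%
  group_by(standardized_label, year_end, quarter_end, filed) %>%
  summarise(val = sum(val), .groups = "drop")

# Step 2: remove duplicates after the sum by keeping only the most
# recently filed total for each label and fiscal period.
df_dedup <- df_summed %>%
  group_by(standardized_label, year_end, quarter_end) %>%
  slice_max(filed, n = 1, with_ties = FALSE) %>%
  ungroup()

print(df_dedup)
```

Filtering on duplicated val before the sum would have reduced label "A" to 100 in this toy case, whereas summing first gives 200 for the period.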

gp1981 commented 1 week ago

Removing line 846 (commented out below):

  # Filter out rows with duplicated val for the same standardized label
  # df_std_CF <- df_std_CF %>%
  #   # Group by the standardized label for the same year and same quarter
  #   group_by(standardized_label, year_end, quarter_end) %>%
  #   # Arrange by quarter_end and descending date "filed"
  #   arrange(desc(quarter_end),desc(filed)) %>%
  #   # Keep only the first occurrence of 'val' within each group
  #   filter(!duplicated(standardized_label,val)) %>%
  #   ungroup()

causes double entries in df_std_CF after pivoting.

gp1981 commented 1 week ago

Problem identified: the correct val is removed when filtering out duplicated val. In line 846, rows with a duplicated standardized_label are filtered, keeping only the "first" occurrence.

  • Removal of duplicated val is also performed in df_std_IS.
  • Removal of duplicated val is performed prior to the sum of val for the same standardized_label for the same fiscal period.

Suggested action: reconsider removing duplicates only "after" the sum of val for the same standardized_label for the same fiscal period

and

removing line 846 (commented out below):

  # Filter out rows with duplicated val for the same standardized label
  # df_std_CF <- df_std_CF %>%
  #   # Group by the standardized label for the same year and same quarter
  #   group_by(standardized_label, year_end, quarter_end) %>%
  #   # Arrange by quarter_end and descending date "filed"
  #   arrange(desc(quarter_end),desc(filed)) %>%
  #   # Keep only the first occurrence of 'val' within each group
  #   filter(!duplicated(standardized_label,val)) %>%
  #   ungroup()

causes double entries in df_std_CF after pivoting.

Solved. (In line 857, all other grouping parameters, e.g. sic, sicDescription, etc., were removed from the group_by of df_std_CF_pivot.)
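A small sketch of why extra identifying columns can produce double entries when pivoting. It uses tidyr's `pivot_wider()` with `id_cols`, which may differ from how df_std_CF_pivot is actually built in the script; the data, the labels and the two spellings of sicDescription are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: same fiscal period, but sicDescription differs
# between the two rows (hypothetical values).
df_long <- data.frame(
  year_end = 2021,
  quarter_end = 4,
  sicDescription = c("Description v1", "Description v2"),
  standardized_label = c("Label A", "Label B"),
  val = c(10, 20)
)

# Keeping sicDescription as an identifying column splits the period
# into two rows, each with an NA in one of the label columns.
wide_extra <- df_long %>%
  pivot_wider(
    id_cols = c(year_end, quarter_end, sicDescription),
    names_from = standardized_label,
    values_from = val
  )

# Identifying rows by fiscal period only yields a single row per period.
wide_min <- df_long %>%
  pivot_wider(
    id_cols = c(year_end, quarter_end),
    names_from = standardized_label,
    values_from = val
  )

print(nrow(wide_extra))
print(nrow(wide_min))
```

Descriptive columns such as sic or sicDescription could, if still needed, be joined back onto the pivoted result afterwards.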

gp1981 commented 1 week ago

The calculation of the following standardized_label is to be fixed (ref. End = 2021-12-31) for Cash Flow TTM

  • [x] (Operating Activities) Cash Flow Depreciation, Depletion, Ammortization_12m
  • [x] sign - (Operating Activities) Change in Accounts Receivable_12m ...
  • [ ] (Operating Activities) Cash Flow from Operating Activities_12m ...

To be solved: (Operating Activities) Cash Flow from Operating Activities_12m