toddwschneider / sec-13f-filings

A nicer way to view SEC 13F filings data
https://13f.info
MIT License

SEC 13F Filings

The code for 13f.info, a more user-friendly way to view SEC 13F filings—the quarterly reports that list certain equity assets held by institutional investment managers

The Rails app has two primary functions:

  1. A back end that downloads 13F data from the SEC's EDGAR system and processes it into a structured PostgreSQL database
  2. A front end that provides a way to view the processed data

Even if you don't care about the front end, you might find the code helpful purely for maintaining a relational database of 13F holdings reports

Live examples

Some example links to showcase the app's functionality:

It might be helpful to compare the above to the SEC website's version of the Berkshire Hathaway Q4 2020 13F

Limitations & caveats

TL;DR: the SEC does not review filings for accuracy, there don't appear to be many validations on the SEC's side to ensure valid submissions, and even if everything is accurate, 13Fs still don't paint a complete picture of a manager's positions and/or investment outlook. Please do your own research before drawing any conclusions from 13F data

Some other notable limitations more specific to this app:

Getting started with development

Prerequisites

The app is a fairly standard Ruby on Rails app. Its primary dependencies include:

Setting up each of these is beyond the scope of this readme, but if you don't know where to begin, I'd recommend the official Getting Started with Rails guide. A future improvement to this repo could be to include a Docker container to help with environment setup

Install Ruby/JavaScript dependencies and initialize database

Once the prerequisite tools are all configured, run the following commands from the project's root directory:

bundle
bundle exec rake db:setup
yarn

Declare user agent with the SEC

Per the SEC Webmaster FAQ, you need to declare your user agent:

User-Agent: Sample Company Name AdminContact@<sample company domain>.com

The app looks for an environment variable called SEC_USER_AGENT. You can set it in development by creating a .env file in the project root and adding SEC_USER_AGENT="Sample Company Name AdminContact@<sample company domain>.com", substituting your own name/email
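As a rough illustration of how that variable might be consumed, the sketch below reads SEC_USER_AGENT and attaches it to a request using only Ruby's standard library. The helper name build_sec_request, the error message, and the sample values are hypothetical and not taken from the app's code. The request is built but never sent.

```ruby
require "net/http"
require "uri"

# Hypothetical helper, not part of the app: builds (but does not send) a GET
# request that declares the user agent the SEC asks for.
def build_sec_request(url, env: ENV)
  user_agent = env.fetch("SEC_USER_AGENT") do
    raise KeyError, "SEC_USER_AGENT is not set; add it to your .env file"
  end

  request = Net::HTTP::Get.new(URI(url))
  request["User-Agent"] = user_agent
  request
end

# Usage with an explicit stand-in environment (assumed values for illustration)
request = build_sec_request(
  "https://www.sec.gov/",
  env: { "SEC_USER_AGENT" => "Sample Company Name admin@example.com" }
)
puts request["User-Agent"]
```

Failing loudly when the variable is missing is a design choice for this sketch, so that a misconfigured environment is caught before any request reaches the SEC.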

Database schema

There are three main tables:

  1. thirteen_fs - one row for each filing. Roughly corresponds to a filing's "primary doc" XML available on the SEC's website
  2. holdings - each thirteen_f record has many holdings. One holding corresponds to a row in the "information table" XML
  3. aggregate_holdings - a denormalized version of holdings which aggregates across the other_manager and investment_discretion columns. In practice it seems like most of the time it's more interesting to look at aggregate_holdings instead of holdings, but the app keeps both around. aggregate_holdings could be a view instead of a table, but I found that the indexed table helped significantly with query performance
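To make the relationship between holdings and aggregate_holdings more concrete, here is a minimal plain-Ruby sketch of the aggregation idea. It assumes rows are grouped by CUSIP only and that the numeric columns are called value and shares. Those names, the grouping key, and the sample rows are assumptions for illustration; the app's actual aggregation may group on additional columns.

```ruby
# Stand-in rows for the holdings table. Column names other than other_manager
# and investment_discretion are assumptions, and the values are made up.
holdings = [
  { cusip: "000000001", other_manager: nil, investment_discretion: "SOLE", value: 100, shares: 10 },
  { cusip: "000000001", other_manager: "1", investment_discretion: "DFND", value: 250, shares: 25 },
  { cusip: "000000002", other_manager: nil, investment_discretion: "SOLE", value: 40, shares: 4 }
]

# Collapse rows that differ only by other_manager / investment_discretion
# into one row per CUSIP, summing the numeric columns.
def aggregate_holdings(rows)
  rows.group_by { |row| row[:cusip] }.map do |cusip, group|
    {
      cusip: cusip,
      value: group.sum { |row| row[:value] },
      shares: group.sum { |row| row[:shares] }
    }
  end
end

aggregate_holdings(holdings).each { |row| puts row }
```

In this sketch the two rows for the first CUSIP collapse into one, which is the "one row per position" view that tends to be more interesting than the raw holdings rows.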

There are a few materialized views that are calculated from the above tables and used to determine "canonical" names for each manager and CUSIP. See the db/views/ folder for details

Populate database with 13F data

There are a few ways to populate data. The simplest is to use the provided MinimalDbSeeder class, which will import and process recent filings from a handful of investment managers

bundle exec rake filings:seed_minimal_db

You can change the default managers and/or time periods either by editing minimal_db_seeder.rb, or by specifying options in the Rails console:

# look up manager CIKs at https://www.sec.gov/edgar/searchedgar/cik.htm
my_ciks = ["CIK1", "CIK2"]
filing_periods = [{year: 2018, quarter: 1}, {year: 2018, quarter: 2}]
MinimalDbSeeder.new(ciks: my_ciks, periods: filing_periods).seed_minimal_db!

The minimal db seeder is intended as a quick and easy way to get your database into a useful state for development purposes, but if you want to import all filings from a given quarter, you can use the following method from within the Rails console:

ThirteenF.import_filings!(filing_year: 2021, filing_quarter: 1)

There's also a rake task available to import all filings from all quarters from Q1 2014 through the present:

bundle exec rake filings:import_all
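If you would rather import a custom range of quarters than everything, one option is to generate the list of periods yourself. The sketch below is not the rake task's implementation. The helper filing_periods_since is hypothetical, and it assumes calendar quarters.

```ruby
require "date"

# Hypothetical helper: lists every {year:, quarter:} pair from a starting
# quarter through the quarter containing `today`.
def filing_periods_since(start_year, start_quarter, today: Date.today)
  current_quarter = (today.month - 1) / 3 + 1
  periods = []
  year = start_year
  quarter = start_quarter

  while year < today.year || (year == today.year && quarter <= current_quarter)
    periods << { year: year, quarter: quarter }
    quarter += 1
    if quarter > 4
      quarter = 1
      year += 1
    end
  end

  periods
end

periods = filing_periods_since(2014, 1, today: Date.new(2021, 3, 31))
puts periods.length
puts periods.first
puts periods.last
```

Inside the Rails console, each generated period could then be passed to ThirteenF.import_filings!(filing_year: ..., filing_quarter: ...) as shown earlier.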

The ThirteenF.import_filings! method will create one placeholder row in the thirteen_fs table for each filing on the SEC's website, but it will not fetch the data for each filing. In order to fetch and process the data into the holdings and aggregate_holdings tables, you need to call thirteen_f.process! on each record, which:

  1. Fetches the primary doc and info table XML files from the SEC's website
  2. Stores them in the relevant primary_doc_xml and info_table_xml columns in the thirteen_fs table
  3. Inserts the appropriate rows into the holdings and aggregate_holdings tables
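The third step can be pictured as turning each entry in the information table XML into one holdings row. The sketch below uses REXML on a made-up, namespace-free sample. The element names are assumed to follow the SEC's information table format, while the helper parse_info_table and the output keys are hypothetical. It is not the app's actual parsing code, and real filings declare an XML namespace and carry more fields.

```ruby
require "rexml/document"

# Made-up sample data for illustration only.
sample_info_table_xml = <<~XML
  <informationTable>
    <infoTable>
      <nameOfIssuer>EXAMPLE CORP</nameOfIssuer>
      <cusip>000000001</cusip>
      <value>1500</value>
      <shrsOrPrnAmt>
        <sshPrnamt>100</sshPrnamt>
        <sshPrnamtType>SH</sshPrnamtType>
      </shrsOrPrnAmt>
      <investmentDiscretion>SOLE</investmentDiscretion>
    </infoTable>
    <infoTable>
      <nameOfIssuer>SAMPLE INC</nameOfIssuer>
      <cusip>000000002</cusip>
      <value>300</value>
      <shrsOrPrnAmt>
        <sshPrnamt>20</sshPrnamt>
        <sshPrnamtType>SH</sshPrnamtType>
      </shrsOrPrnAmt>
      <investmentDiscretion>DFND</investmentDiscretion>
    </infoTable>
  </informationTable>
XML

# Hypothetical parser: returns one hash per <infoTable> element.
def parse_info_table(xml)
  document = REXML::Document.new(xml)
  document.get_elements("//infoTable").map do |row|
    {
      issuer_name: row.elements["nameOfIssuer"].text,
      cusip: row.elements["cusip"].text,
      value: row.elements["value"].text.to_i,
      shares: row.elements["shrsOrPrnAmt/sshPrnamt"].text.to_i,
      investment_discretion: row.elements["investmentDiscretion"].text
    }
  end
end

parse_info_table(sample_info_table_xml).each { |holding| puts holding }
```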

The ThirteenF.cache_data_and_create_holdings_for_unprocessed method will queue up asynchronous delayed jobs to process whatever unprocessed records are in your thirteen_fs table. You can work off those jobs by running a delayed job worker from the project root:

bundle exec rake jobs:work

Processing seems to average about 1.5 records per second, and as of March 2021 there are ~140,000 records, so it might take over a day to process all of them. Note that the SEC's website has rate limits in place, so I would not recommend running more than 2 workers at a time
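The "over a day" estimate follows directly from those two figures. Here is the arithmetic as a small sketch, assuming 1.5 records per second is the overall throughput rather than a per-worker rate. The helper name is hypothetical.

```ruby
# Figures quoted above: ~140,000 records at about 1.5 records per second.
def estimated_processing_hours(record_count, records_per_second)
  seconds = record_count / records_per_second.to_f
  seconds / 3600.0
end

hours = estimated_processing_hours(140_000, 1.5)
puts format("%.1f hours (%.2f days)", hours, hours / 24)
```

That works out to roughly 26 hours, and the total will grow as new filings are added each quarter.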

Running a development server

The app uses the Webpacker gem. I find that the best development experience is to run the Rails server and the Webpack dev server in separate terminal windows:

rails server
./bin/webpack-dev-server

Keeping the database updated in production

You can run one clock and (at least) one worker process to keep the database up to date as new filings come in. There's also the clockandworker process, which can run on a single Heroku dyno. See the Procfile for usage

Other development notes

The app uses the Tailwind CSS framework. If you've never used Tailwind before, the short version is that you generally don't write CSS; instead, you apply preexisting classes to your HTML templates. Special thanks to Edwin Morris for helping me get set up with Tailwind

The tables are built with DataTables, in most cases using AJAX data sources. Most of the relevant logic lives in the DataController and DataTableFormatter classes

There is no logged-in experience, which makes it easier to use edge caching via public Cache-Control headers
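For readers unfamiliar with public caching, the sketch below shows what such a header looks like, using a Rack-style lambda in plain Ruby with no gems. The helper public_cache_headers and the one-hour max-age are assumptions for illustration, not the app's actual settings. In a Rails controller the usual equivalent is expires_in with public: true.

```ruby
# Hypothetical helper: builds a header marking the response as cacheable by
# shared caches (CDNs, proxies). The default of 3600 seconds is an assumed value.
def public_cache_headers(max_age_seconds: 3600)
  { "cache-control" => "public, max-age=#{max_age_seconds}" }
end

# Rack-style app as a plain lambda: returns [status, headers, body].
app = lambda do |_env|
  [200, public_cache_headers.merge("content-type" => "text/plain"), ["ok"]]
end

status, headers, _body = app.call({})
puts status
puts headers["cache-control"]
```

Because no response depends on who is viewing it, the same cached copy can safely be served to every visitor.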

Ideas for future improvements

Questions/issues/contact

todd@toddwschneider.com, or open a GitHub issue