tidy-finance / website

This repository hosts the source code for the website tidy-finance.org
https://tidy-finance.org

Reconsider industry selection in CRSP v2 #102

Closed patrick-weiss closed 3 months ago

patrick-weiss commented 4 months ago

We currently use siccd from msf_db (i.e., in_schema("crsp", "msf_v2")). However, the more accurate source is siccd from stksecurityinfohist_db (i.e., in_schema("crsp", "stksecurityinfohist")). Yes, the variable has the same name in both tables, but its meaning differs depending on the table.

Reasoning: siccd in msf_v2 is back-filled. A simple check reveals this back-fill (requires an active WRDS connection):

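# Deduplicate in the database before downloading the pairs
siccd_permno_pairs <- msf_db |>
  select(permno, siccd) |>
  distinct() |>
  collect()

# TRUE if every permno maps to exactly one siccd
nrow(siccd_permno_pairs) == length(unique(siccd_permno_pairs$permno))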

The check returns TRUE, so every permno has only a single industry code in msf_v2. That would imply that no firm ever changed its industry, but firms do change industries. Hence, siccd in msf_v2 must be back-filled.
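To make the logic of the check concrete without a WRDS connection, here is a minimal sketch on toy data. The permno values, dates, and SIC codes are invented for illustration, and has_single_siccd() is a hypothetical helper, not part of the repository.

```r
library(dplyr)

# Toy data (invented values): permno 10001 changes industry in 2005.
# The back-filled table carries the most recent code on every row.
backfilled <- data.frame(
  permno = c(10001, 10001, 10002),
  date   = as.Date(c("2000-01-31", "2005-01-31", "2000-01-31")),
  siccd  = c(6020, 6020, 2834)
)

# The historical table keeps the code that was valid at each date.
historical <- data.frame(
  permno = c(10001, 10001, 10002),
  date   = as.Date(c("2000-01-31", "2005-01-31", "2000-01-31")),
  siccd  = c(3570, 6020, 2834)
)

# Hypothetical helper: TRUE when every permno maps to exactly one siccd.
has_single_siccd <- function(data) {
  pairs <- data |> distinct(permno, siccd)
  nrow(pairs) == n_distinct(pairs$permno)
}

has_single_siccd(backfilled) # TRUE: the industry change is invisible
has_single_siccd(historical) # FALSE: permno 10001 has two codes
```

A TRUE result on the full table is what signals the back-fill: with thousands of firms over decades, at least some permno should show more than one code.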

Relevant for r-tidyfinance

patrick-weiss commented 4 months ago

Additional CRSP v2 implementation issues:

  1. We can do better when filtering exchanges. In particular, the suggestion is to include:

primaryexch in ('N','A', 'Q') and conditionaltype = 'RW' and TradingStatusFlg = 'A'

  2. We could consider highlighting that there is no longer a cumulative adjustment factor for prices and shares outstanding (but only an adjustment factor for the day of the change).

The first is more relevant; the second is just for information. Any thoughts on whether we should implement them?

christophscheuch commented 4 months ago

Ad 1: I'm in favor of the change in principle, but where does the suggestion come from? We should also explain what those filters mean in plain language.

Ad 2: For whom is 2 relevant? For us, it isn't, is it?

patrick-weiss commented 4 months ago

Ad 1: It is based on the WRDS description of how the old exchcd maps into primaryexch. We can consider conditionaltype in ('RW', 'NW'), where RW is regular trading and NW is when issued (as we also use exchcd 31, 32, and 33). The TradingStatusFlg filter is necessary to avoid prices recorded while execution is halted. Source: https://wrds-www.wharton.upenn.edu/pages/support/manuals-and-overviews/crsp/stocks-and-indices/crsp-stock-and-indexes-version-2/crsp-ciz-faq/
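As a rough sketch, the two filter variants discussed in this thread could be written in dplyr as follows. The data frame is a toy stand-in for the CRSP table; the lower-case column name tradingstatusflg and the non-matching values 'B' and 'H' are assumptions made for illustration only.

```r
library(dplyr)

# Toy rows (invented). Column names follow the filter quoted in the thread,
# lower-cased on the assumption that the database exposes them that way.
stocks <- data.frame(
  permno           = c(1, 2, 3, 4, 5),
  primaryexch      = c("N", "A", "Q", "B", "N"),
  conditionaltype  = c("RW", "NW", "RW", "RW", "RW"),
  tradingstatusflg = c("A", "A", "H", "A", "A")
)

# Variant 1: regular trading only.
strict <- stocks |>
  filter(
    primaryexch %in% c("N", "A", "Q"),
    conditionaltype == "RW",
    tradingstatusflg == "A"
  )

# Variant 2: also keep when-issued trading (NW).
extended <- stocks |>
  filter(
    primaryexch %in% c("N", "A", "Q"),
    conditionaltype %in% c("RW", "NW"),
    tradingstatusflg == "A"
  )

strict$permno   # 1 5
extended$permno # 1 2 5
```

The same filter() calls should translate to SQL when applied to a lazy database table instead of a local data frame, although that has not been verified against WRDS here.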

Ad 2: It is not relevant for the book, but there are many situations where you would need it (anything that takes prices as an input without multiplying by shares outstanding).
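For readers who need split-adjusted prices, a cumulative factor can be rebuilt from per-event factors. The sketch below uses invented numbers and a hypothetical column event_factor that equals the split ratio on the event day and 1 otherwise; this encoding is an assumption and may differ from how CRSP actually stores the factor.

```r
# Toy price series (invented) with a 2-for-1 split taking effect on day 3.
prices <- data.frame(
  date         = as.Date(c("2020-01-31", "2020-02-28", "2020-03-31", "2020-04-30")),
  price        = c(100, 102, 51, 52),
  event_factor = c(1, 1, 2, 1) # hypothetical: split ratio on the event day, else 1
)

# Cumulative factor for each date: the product of all event factors that
# occur strictly after that date, so every price is expressed in terms of
# the most recent share basis.
prices$cum_factor <- rev(cumprod(rev(prices$event_factor))) / prices$event_factor

# Prices before the split are scaled down; later prices are unchanged.
prices$adjusted_price <- prices$price / prices$cum_factor

prices$adjusted_price # 50 51 51 52
```

Without this step, the raw series shows a drop from 102 to 51 that reflects the split rather than a return of -50%.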