hodsonjames / employment

Repository for code related to research projects on global employment dynamics.
MIT License

Historical Data Coverage #17

Open AnastassiaFedyk opened 5 years ago

AnastassiaFedyk commented 5 years ago

Hi Honghao,

We would like to investigate how good the data coverage actually is (what percentage of individuals employed at a certain firm we capture), and how that has changed over time. To this end, could you please do the following?

  1. Out of all the firms that we have looked into so far, take those that are publicly traded.
  2. Download the historical numbers of employees (let's say going back to 1990) from Compustat, which you can access through WRDS. By the way, do you have access to WRDS through Northwestern?
  3. Compare the number of employees that we see in our data each year against the numbers from Compustat. What fraction of employees do we capture for each firm? How does that fraction change over time?
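
For illustration, the comparison in step 3 could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the function name `coverage_by_year`, the `(firm, year)` dictionary layout, and all numbers are hypothetical. It also assumes both sources are already expressed in the same units (Compustat employee counts are usually reported in thousands, so the units should be checked before dividing).

```python
def coverage_by_year(resume_counts, compustat_counts):
    """Return {(firm, year): fraction captured} for firm-years found in both sources.

    Both arguments are hypothetical dicts mapping (firm, year) -> headcount.
    Firm-years with a missing or zero Compustat count are skipped.
    """
    coverage = {}
    for key, observed in resume_counts.items():
        total = compustat_counts.get(key)
        if not total:  # no Compustat figure (or zero), so no fraction is defined
            continue
        coverage[key] = observed / total
    return coverage


# Made-up numbers, for illustration only.
resume_counts = {
    ("ACME", 1990): 1200,
    ("ACME", 1991): 1500,
    ("GLOBEX", 1990): 300,  # no Compustat match, so it is dropped
}
compustat_counts = {
    ("ACME", 1990): 4000,
    ("ACME", 1991): 4000,
}

coverage = coverage_by_year(resume_counts, compustat_counts)
for (firm, year), fraction in sorted(coverage.items()):
    print(f"{firm} {year}: {fraction:.1%}")
```

Sorting the result by firm and then year makes it easy to read off how the captured fraction changes over time for each firm.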

Please let me know if you have any questions.

Thank you, Anastassia

c-forrest commented 5 years ago

Hi Professor Fedyk,

Please find the comparison here. It seems that our resume data cover only half or less of the total employees at the 15 public companies. The coverage usually grows slowly over the years. One significant difference between the two sources is that the Compustat numbers are strongly affected by M&A and often change suddenly, while the resume counts always grow smoothly.

Please let me know if you have any questions about this.
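
One way to separate M&A-driven jumps from gradual changes in coverage might be to flag years in which the Compustat headcount moves sharply relative to the previous observation. The snippet below is only a sketch under assumptions: the helper `flag_jumps`, the 25% threshold, and the numbers are all hypothetical and are not taken from the actual comparison.

```python
def flag_jumps(counts_by_year, threshold=0.25):
    """Return the years whose headcount changed by more than `threshold`
    (as a fraction of the previous observed year's count).

    `counts_by_year` is a hypothetical dict mapping year -> headcount.
    """
    years = sorted(counts_by_year)
    flagged = []
    for prev, curr in zip(years, years[1:]):
        base = counts_by_year[prev]
        if base and abs(counts_by_year[curr] - base) / base > threshold:
            flagged.append(curr)
    return flagged


# Made-up series with one sudden jump, as might follow an acquisition.
compustat = {2000: 10000, 2001: 10400, 2002: 16000, 2003: 16300}
jumps = flag_jumps(compustat)
print(jumps)
```

Flagged years could then be checked by hand against known acquisitions before interpreting a drop in the coverage fraction.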

Best, Honghao

hodsonjames commented 5 years ago

Thanks Honghao!

I think coverage is sufficient, though obviously not complete. For larger companies, I would generally be happy with anything above a 10% sample of the employees if it is well distributed across positions/hierarchy.
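
As a rough illustration of the 10% rule of thumb, firm-years falling below that sample fraction could be listed like this. It is a sketch under assumptions: `below_threshold`, the `(firm, year)` keys, and the coverage values are hypothetical, and the check says nothing about whether the sample is well distributed across positions/hierarchy.

```python
MIN_COVERAGE = 0.10  # the 10% sample size mentioned above


def below_threshold(coverage, min_coverage=MIN_COVERAGE):
    """Return the sorted (firm, year) keys whose coverage is under `min_coverage`.

    `coverage` is a hypothetical dict mapping (firm, year) -> captured fraction.
    """
    return sorted(key for key, fraction in coverage.items() if fraction < min_coverage)


# Made-up coverage fractions, for illustration only.
coverage = {
    ("ACME", 2015): 0.42,
    ("GLOBEX", 2015): 0.08,
    ("INITECH", 2015): 0.15,
}

low = below_threshold(coverage)
print(low)
```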