tidy-finance / website

This repository hosts the source code for the website tidy-finance.org
https://tidy-finance.org

blog/tidy-finance-dummy-data #66

Closed christophscheuch closed 1 year ago

christophscheuch commented 1 year ago

Synthetic data is well out of scope because creating synthetic panel data is non-trivial. I also deliberately avoided the term "simulated" data because, at least to me, it implies some meaningful structure according to an economic model. I therefore decided to create "dummy" data: it is really just random data that lets users at least run the code, but not replicate the results.
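
To make the distinction concrete, here is a minimal sketch of what purely random panel data could look like. It is not the code from this PR: the function name `make_dummy_panel`, the column names (`permno`, `month`, `ret`, `mktcap`), and the distributions are all assumptions chosen for illustration only.

```python
import random


def make_dummy_panel(n_firms=5, n_months=12, seed=42):
    """Return one row per firm-month, filled with random numbers.

    Hypothetical illustration: the values have no economic structure,
    so code can run on them but results cannot be replicated.
    """
    rng = random.Random(seed)  # local generator, so the seed makes output reproducible
    rows = []
    for firm in range(1, n_firms + 1):
        for m in range(n_months):
            year = 2020 + m // 12   # arbitrary start year (assumption)
            month = m % 12 + 1
            rows.append({
                "permno": 10000 + firm,            # made-up firm identifier
                "month": f"{year}-{month:02d}",
                "ret": rng.gauss(0.0, 0.1),        # random "return", no model behind it
                "mktcap": rng.uniform(1.0, 1000.0),  # random "market cap"
            })
    return rows


panel = make_dummy_panel()
print(len(panel), panel[0]["permno"], panel[0]["month"])
```

Every firm-month gets an independent draw, so the panel has the right shape for downstream code while carrying no meaningful cross-sectional or time-series structure.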

I know it would be great to have actually meaningful data pulled from a source other than CRSP, but I believe that is hard either from a legal perspective (e.g., tapping SimFin) or from an effort perspective (e.g., extracting the information from raw data). I also don't dare ask ChatGPT for simulated data, because who knows whether it actually takes the data from somewhere.

Please let me know whether you agree with the direction and whether I should continue writing some text around the code chunks.