[x] number of companies, distribution of calls per company
[x] distribution of call dates, distribution of call year + quarter
[x] compare year + quarter versus call date: construct a variable for the gap between the last day of the year + quarter and the call date, then check its distribution (histogram, summary statistics, number of cases with call date before the last day of the year + quarter)
[x] check for duplicates on call date by company, and on year + quarter by company
[x] pick a few companies and see if transcript agrees with year, quarter, call date
[x] merge a dataset of year, quarter, call date with the old earnings call data on year and quarter, then check the old v. new difference in (a) call date and (b) the gap between call date and last day of year + quarter
[ ] any other checks that enable us to be confident in the data quality
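
For reference, a minimal sketch of the gap and duplicate checks from the list above. The column names (`company`, `year`, `quarter`, `call_date`), the toy rows, and the helper `add_gap` are all hypothetical, and the sketch assumes calendar quarters; fiscal quarters would need a different quarter-end rule.

```python
import pandas as pd

# Toy data with hypothetical column names; swap in the real dataset.
calls = pd.DataFrame({
    "company": ["A", "A", "B", "B"],
    "year": [2020, 2020, 2020, 2020],
    "quarter": [1, 2, 1, 1],
    "call_date": pd.to_datetime(
        ["2020-04-28", "2020-06-15", "2020-05-05", "2020-05-05"]
    ),
})


def add_gap(df):
    """Add quarter_end and gap_days (call date minus quarter end).

    Assumes calendar quarters, so quarter q ends in month 3 * q.
    """
    out = df.copy()
    # First day of the quarter's last month, then roll to month end.
    first_of_month = pd.to_datetime(pd.DataFrame({
        "year": out["year"],
        "month": out["quarter"] * 3,
        "day": 1,
    }))
    out["quarter_end"] = first_of_month + pd.offsets.MonthEnd(0)
    out["gap_days"] = (out["call_date"] - out["quarter_end"]).dt.days
    return out


checked = add_gap(calls)

# Summary statistics of the gap.
gap_summary = checked["gap_days"].describe()

# Calls dated before the last day of their year + quarter.
n_before = int((checked["gap_days"] < 0).sum())

# Duplicates on call date by company, and on year + quarter by company.
dup_date = checked.duplicated(["company", "call_date"], keep=False)
dup_quarter = checked.duplicated(["company", "year", "quarter"], keep=False)

print(gap_summary)
print("calls before quarter end:", n_before)
print(checked[dup_date | dup_quarter])
```

In a notebook, `checked["gap_days"].plot.hist()` gives the histogram (this needs matplotlib installed).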
Where possible, use the same code for the old and new data.
I'd make a new notebook, we can delete the old one if the data looks good.
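
One way to share code between old and new, sketched under assumptions: both datasets are loaded into frames with the same hypothetical columns (`company`, `year`, `quarter`, `call_date`), and the helper `call_summary` is made up for illustration. The merge here also keys on `company`, which is an assumption on top of the year + quarter merge described in the checklist.

```python
import pandas as pd


def call_summary(df):
    """Summary that runs unchanged on either the old or the new data."""
    per_company = df.groupby("company").size()
    return {
        "n_companies": int(per_company.size),
        "n_calls": int(len(df)),
        "max_calls_per_company": int(per_company.max()),
    }


# Toy stand-ins for the old and new earnings call data.
old = pd.DataFrame({
    "company": ["A", "A", "B"],
    "year": [2020, 2020, 2020],
    "quarter": [1, 2, 1],
    "call_date": pd.to_datetime(["2020-04-28", "2020-07-30", "2020-05-05"]),
})
new = pd.DataFrame({
    "company": ["A", "A", "C"],
    "year": [2020, 2020, 2020],
    "quarter": [1, 2, 1],
    "call_date": pd.to_datetime(["2020-04-28", "2020-07-29", "2020-05-01"]),
})

# Side-by-side summary from the same function.
summaries = pd.DataFrame({"old": call_summary(old), "new": call_summary(new)})

# Outer merge keeps rows that exist in only one dataset; `_merge` flags them.
merged = old.merge(
    new,
    on=["company", "year", "quarter"],
    how="outer",
    suffixes=("_old", "_new"),
    indicator=True,
)
# Difference in call date, new minus old (NaN where one side is missing).
merged["date_diff_days"] = (
    merged["call_date_new"] - merged["call_date_old"]
).dt.days

print(summaries)
print(merged["_merge"].value_counts())
print(merged[["company", "year", "quarter", "date_diff_days"]])
```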
EDA inherited from #7.