current12 / Stat-222-Project

Code Review for Data Loading and Cleaning #26

Closed ijyliu closed 6 months ago

ijyliu commented 6 months ago

Data Pipeline Steps

  1. Code/Data Loading and Cleaning/Combine Credit Rating Data.ipynb, Code/Earning Calls/calls2sec.ipynb, Code/Data Loading and Cleaning/tabular_findata_retrival&loading.ipynb
  2. Code/Data Loading and Cleaning/Credit Ratings on Earnings Call Dates.ipynb
  3. Code/Data Loading and Cleaning/Create Combined All Data.ipynb

ijyliu commented 6 months ago

@OwenLin2001 @current12 I suggest reviewing steps 2 and 3 listed above, both in the Data Loading and Cleaning folder.

ijyliu commented 6 months ago

The pipeline steps are also in the README on the main page: https://github.com/current12/Stat-222-Project

along with notes about filepaths and a conda environment.

ijyliu commented 6 months ago

issue on hold pending data restructure

ijyliu commented 6 months ago

Here's new code to review. It restructures the data so that each row is a company at a fixed quarter date (1/1, 4/1, etc.) in a given year, with the most recent earnings call and financial statement data as of that date attached:

  1. https://github.com/current12/Stat-222-Project/blob/restructure-to-fixed-quarter/Code/Data%20Loading%20and%20Cleaning/Credit%20Ratings%20on%20Fixed%20Quarter%20Dates%20with%20Earnings%20Call%20Date%20for%20Linkup.ipynb

  2. https://github.com/current12/Stat-222-Project/blob/restructure-to-fixed-quarter/Code/Data%20Loading%20and%20Cleaning/Create%20Combined%20All%20Data%20-%20Fixed%20Quarter%20Dates.ipynb

I know the code/output might be a little cut off, but there's a download button in the top right, so you can just download a copy of the notebook.

This is done on a separate branch. Let me know and I'll merge it into the main one. The output data files are new files.
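
For anyone reviewing, here is a minimal sketch of the "most recent record as of a fixed quarter date" idea using `pandas.merge_asof`. It is not the notebook code: the column names (`ticker`, `quarter_date`, `call_date`, `call_id`), the helper `attach_most_recent`, and the toy data are all hypothetical, chosen only to illustrate the row structure described above.

```python
import pandas as pd


def attach_most_recent(grid, records, date_col):
    """For each (ticker, quarter_date) row in grid, attach the latest row of
    records whose date_col falls on or before quarter_date.

    All column names here are illustrative assumptions, not the project's.
    """
    # merge_asof requires both frames to be sorted by their merge keys.
    left = grid.sort_values("quarter_date")
    right = records.sort_values(date_col)
    return pd.merge_asof(
        left,
        right,
        left_on="quarter_date",
        right_on=date_col,
        by="ticker",            # match within the same company only
        direction="backward",   # latest record at or before the quarter date
    )


# Hypothetical earnings call records.
calls = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB"],
    "call_date": pd.to_datetime(["2020-02-15", "2020-05-10", "2020-03-20"]),
    "call_id": [1, 2, 3],
})

# One row per company and fixed quarter date (1/1, 4/1, 7/1).
quarter_dates = pd.to_datetime(["2020-01-01", "2020-04-01", "2020-07-01"])
grid = pd.MultiIndex.from_product(
    [["AAA", "BBB"], quarter_dates], names=["ticker", "quarter_date"]
).to_frame(index=False)

result = attach_most_recent(grid, calls, "call_date")
print(result.sort_values(["ticker", "quarter_date"]))
```

Quarter dates with no earlier record (1/1 in this toy data) come back with missing values rather than being dropped, so the company-by-quarter grid is preserved. The same helper could be called a second time with a financial statement table to attach that data as well.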

ijyliu commented 6 months ago

Code was merged to main.

Closing, but feel free to reopen in the future and send notes/comments.