current12 / Stat-222-Project

Backfill Credit Ratings All The Way To Q1 2010? #30

Closed ijyliu closed 6 months ago

ijyliu commented 6 months ago

Libor mentioned potentially backfilling credit ratings to 2010 (the start of the data) for all companies. For each company, we'd create a synthetic rating equal to the earliest rating we currently have for it, and date it 1/1/2010 (or really around 4/6/2010, since that's when our data actually starts).
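To make the proposal concrete, here is a minimal sketch of what the backfill would do. The column names (`company`, `date`, `rating`), the `synthetic` flag, and the function name are hypothetical, and the toy data is made up; this is not code from the repo.

```python
import pandas as pd

def backfill_earliest_rating(ratings: pd.DataFrame,
                             start_date: str = "2010-04-06") -> pd.DataFrame:
    """Add one synthetic row per company: its earliest observed rating, re-dated to start_date."""
    ratings = ratings.assign(date=pd.to_datetime(ratings["date"]))
    start = pd.Timestamp(start_date)
    # Earliest observed row for each company
    earliest = ratings.sort_values("date").drop_duplicates("company", keep="first")
    # Only companies whose first rating comes after the start date get a synthetic row
    synthetic = earliest[earliest["date"] > start].copy()
    synthetic["date"] = start
    synthetic["synthetic"] = True
    out = pd.concat([ratings.assign(synthetic=False), synthetic], ignore_index=True)
    return out.sort_values(["company", "date"]).reset_index(drop=True)

# Toy data (hypothetical): A first appears in 2012, B is present from the start
toy = pd.DataFrame({
    "company": ["A", "A", "B"],
    "date": ["2012-03-01", "2014-06-01", "2010-04-06"],
    "rating": ["BBB", "A", "AA"],
})
backfilled = backfill_earliest_rating(toy)
```

Keeping a `synthetic` flag would at least let us drop the backfilled rows again, or check how sensitive results are to them.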

This would fix a quirk our sample seems to have: companies enter in a very staggered fashion, so the number of companies we have increases over time.

[Image, from the credit ratings EDA]
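A rough way to check the staggered entry numerically is to count, per quarter, how many companies have appeared in the ratings data so far. This is only a sketch on made-up toy data with hypothetical column names (`company`, `date`), not the EDA code itself.

```python
import pandas as pd

# Toy rating issuances (hypothetical data)
toy = pd.DataFrame({
    "company": ["A", "A", "B", "C"],
    "date": pd.to_datetime(["2010-04-06", "2011-01-15", "2011-02-01", "2012-07-20"]),
})

# Quarter in which each company first appears in the ratings data
first_quarter = toy.groupby("company")["date"].min().dt.to_period("Q")

# New entrants per quarter, then the running total of companies seen so far
entrants = first_quarter.value_counts().sort_index()
companies_so_far = entrants.cumsum()
```

If entry were not staggered, nearly all of the mass in `entrants` would sit in the first quarter.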

But I'm not sure this is a good approach. It really seems to stretch the assumptions about the credit rating data. We'd be believing that a ton of companies just didn't get any rating issuances in 2010 and the other early years, when I think the more plausible explanation is that our data is incomplete or not very good for those years. My perception of the data's incompleteness is informed by its sources: a lot of web scraping that might have only started in later years. Backfilling would also be wrong for companies that only came into existence, or only started issuing bonds, after 2010.

What do you think about backfilling? I'm inclined to not do it.

ijyliu commented 6 months ago

I also think that when he said this, he may still have been under the impression that our raw data was changes in ratings, rather than what we actually have: both changes in ratings and reaffirmations (issuances).

We just don't have data for 2010 and the early years for most companies. If we had only change data, then maybe you could interpret the gaps as no change, but since we see both changes and reaffirmations in the periods we do cover, that's a hard case to make.

seanzhou1207 commented 6 months ago

Yeah I think just the multi-class classification makes sense to me too. We don’t necessarily need to follow his advice.

OwenLin2001 commented 6 months ago

Need some more details on how we craft the new columns (change in credit rating). And, for the case where we classify the rating itself, on how we impute the credit rating.

ijyliu commented 6 months ago

Check out the new code I put in #26
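For anyone reading without #26 open, here is a rough sketch of the general idea. It is not a copy of that code: the column names, the toy rating scale, and the quarter-end grid are all assumptions for illustration. The change column compares each issuance with the company's previous one, and imputation carries the most recent rating forward onto each quarter.

```python
import pandas as pd

# Ordinal scale for illustration only (a real scale has many more notches)
scale = {"AAA": 1, "AA": 2, "A": 3, "BBB": 4, "BB": 5}

# Toy issuances (hypothetical): a reaffirmation in 2013, then an upgrade in 2014
issuances = pd.DataFrame({
    "company": ["A", "A", "A"],
    "date": pd.to_datetime(["2012-03-01", "2013-03-01", "2014-06-01"]),
    "rating": ["BBB", "BBB", "A"],
}).sort_values(["company", "date"])

# Previous rating within each company; the first observed issuance has none
issuances["prev_rating"] = issuances.groupby("company")["rating"].shift(1)
num = issuances["rating"].map(scale)
prev_num = issuances["prev_rating"].map(scale)

# Lower number = better rating on this toy scale
issuances["change"] = "same"
issuances.loc[num < prev_num, "change"] = "upgrade"
issuances.loc[num > prev_num, "change"] = "downgrade"
issuances.loc[prev_num.isna(), "change"] = pd.NA

# Imputation: each quarter-end takes the most recent rating on or before it
quarters = pd.DataFrame({
    "company": ["A", "A", "A"],
    "date": pd.to_datetime(["2012-03-31", "2012-06-30", "2014-06-30"]),
})
panel = pd.merge_asof(
    quarters.sort_values("date"),
    issuances.sort_values("date")[["company", "date", "rating"]],
    on="date",
    by="company",
)
```

Because `merge_asof` only looks backward by default, quarters before a company's first issuance stay missing rather than being backfilled.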

ijyliu commented 6 months ago

Decided not to backfill.