kenneth-rios / mixed-panel-logit-default-risk

Forecasting Sovereign External Debt Default via Mixed Panel Logit Simulation

Data Quirks #2

Open kenneth-rios opened 5 years ago

kenneth-rios commented 5 years ago

1). Remove roughly six defaults from LHS data that Cruces & Trebesch claim were "domestic debt" restructurings.

2). For the rest of the defaults that Cruces & Trebesch don't include for whatever reason (other than being "domestic"), we can simply drop them when we curve-fit predicted unconditional probabilities on average haircut (weighted by relative debt restructured).

kenneth-rios commented 5 years ago

3). Before reshaping the EIU data, try to find an "optimal" continuous range of years for each country that minimizes the presence of NAs. Note that we do not need a balanced panel! Keep in mind that we especially want to retain years in which poorer countries defaulted, since data for those years are less likely to be available ex ante. For any "holes" in the data within the chosen range of years, we can simply impute, for instance with a moving-average model.

Basically, we want as large a subset of contiguous data as possible, by year and by country, for select variables. We need enough data to estimate all the random coefficients in the mixed logit: $n$ coefficients for each variable in $\boldsymbol{\beta}$, where $n$ is the total number of countries in our dataset.
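A minimal sketch of the window search and hole filling from 3), under some assumptions: the input is a per-country count of missing variables by year, years are consecutive, and "optimal" means the lowest average number of NAs per year among windows of at least `min_len` years (ties go to the longer window). The function names `best_year_window` and `fill_holes_trailing_mean` are hypothetical, and the trailing mean is only a simple stand-in for a proper moving-average model.

```python
from typing import Dict, List, Optional, Tuple


def best_year_window(missing_by_year: Dict[int, int], min_len: int) -> Tuple[int, int]:
    """Return (start, end) of the contiguous year range with the fewest NAs per year.

    Assumption: keys are consecutive years; ties are broken in favour of longer windows.
    """
    years = sorted(missing_by_year)
    best = None
    for i in range(len(years)):
        total = 0
        for j in range(i, len(years)):
            total += missing_by_year[years[j]]
            length = j - i + 1
            if length < min_len:
                continue
            score = (total / length, -length)
            if best is None or score < best[0]:
                best = (score, (years[i], years[j]))
    if best is None:
        raise ValueError("no window of the requested length exists")
    return best[1]


def fill_holes_trailing_mean(values: List[Optional[float]], window: int = 3) -> List[Optional[float]]:
    """Replace each None with the mean of the previous `window` values.

    Already-filled values feed into later means; leading holes stay None.
    """
    filled: List[Optional[float]] = []
    for v in values:
        if v is None:
            recent = [x for x in filled[-window:] if x is not None]
            v = sum(recent) / len(recent) if recent else None
        filled.append(v)
    return filled


# Made-up NA counts for one country, purely for illustration.
na_counts = {2000: 5, 2001: 4, 2002: 1, 2003: 0, 2004: 1, 2005: 0, 2006: 3}
window = best_year_window(na_counts, min_len=3)
series = fill_holes_trailing_mean([1.0, 2.0, None, 4.0], window=2)
```

Raising `min_len` trades a few more NAs for longer panels, which matters here because the random coefficients need enough observations per country.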

kenneth-rios commented 5 years ago

We should use all data up to the year before the one we wish to predict unconditional probabilities for, since that is what credit rating agencies do when they assign sovereign ratings at the beginning of the year they are predicting (the very end of the previous year also works).

In other words: to predict 2015 probabilities, we use all data up to 2014; to predict 2016 probabilities, we use all data up to 2015; and to predict 2017 probabilities, we use all data up to 2016. That means running three separate mixed panel logit models. We then compare those probabilities with the sovereign ratings released by Fitch/Moody's/S&P at the beginning of those years (or, again, the very end of the previous year).
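The three train/test splits could be built roughly like this. It is only a sketch: it assumes each observation is a dict with a `year` key, and `expanding_window_splits` is a hypothetical helper name. The model fitting itself is left out.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def expanding_window_splits(rows: List[Row], test_years: List[int]) -> Dict[int, Tuple[List[Row], List[Row]]]:
    """For each test year, train on all earlier years and test on that year only."""
    splits = {}
    for year in test_years:
        train = [r for r in rows if r["year"] < year]
        test = [r for r in rows if r["year"] == year]
        splits[year] = (train, test)
    return splits


# Toy panel with made-up country codes: one row per country-year.
rows = [{"country": c, "year": y} for c in ("A", "B") for y in range(2012, 2018)]
splits = expanding_window_splits(rows, test_years=[2015, 2016, 2017])

for year, (train, test) in splits.items():
    last_train_year = max(r["year"] for r in train)
    print(year, "train through", last_train_year, "| test rows:", len(test))
```

Each test year gets its own model fit on its own training set, so no information from the predicted year leaks into estimation.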

If our results aren't robust to single years of test data, then we can try pooling across the three years of test data and then plot an ROC curve. ~But I believe we should have good results for the single-year analysis (at least compared to the credit rating agencies!).~

kenneth-rios commented 5 years ago

Data summary from @shukritg Mixed_logit_data_summary.docx

The final dataset so far has 838 observations on 21 one-year-lagged variables, spanning 59 countries and 19 years across the country panels.

I think 2010 makes a good cutoff year for prediction based on aggregate default histories:

Pre-2010 -> training
Post-2010 (until 2017) -> test

BUT WE MUST IMPUTE DATA FOR GREECE VARIABLES POST-2007!

Then we compare the model's performance on the test data, using logits, with that of the credit rating agencies' late-2009 long-term foreign bond ratings. We can measure our classification rate using signal/noise maximization or Type II error minimization, and compare it to the credit rating agencies' accuracy ratio using ROC analysis.
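For the ROC comparison, here is a rough sketch of the area under the ROC curve and the accuracy ratio, using the standard identity AR = 2 * AUC - 1. It assumes higher scores mean higher default risk and labels are 1 for default, 0 otherwise. Agency ratings would first have to be mapped to an ordinal risk score, which is not shown. The function names and the toy numbers are illustrative only.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_ratio(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Accuracy ratio (Gini coefficient) derived from the AUC."""
    return 2.0 * roc_auc(scores, labels) - 1.0


# Made-up predicted probabilities and outcomes, not results from the model.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_ar = accuracy_ratio(toy_scores, toy_labels)
```

The pairwise count is quadratic in the number of observations, which is fine for a test set of this size.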

Possibly four countries are misnamed and causing mismatches in the merge. We will inspect. I still have to remove the six defaults vis-à-vis Cruces & Trebesch.
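One way to surface the misnamed countries before merging is to list names that appear in one source but not the other, along with the closest spelling on the other side. This is a sketch using the standard-library `difflib`. `unmatched_names` is a hypothetical helper, and the country names below are invented examples, not the four actual mismatches.

```python
import difflib
from typing import Dict, Iterable, List


def unmatched_names(left_names: Iterable[str], right_names: Iterable[str],
                    cutoff: float = 0.6) -> Dict[str, List[str]]:
    """Map each left-only name to its closest right-side candidate (empty list if none)."""
    right = sorted(set(right_names))
    report = {}
    for name in sorted(set(left_names) - set(right)):
        report[name] = difflib.get_close_matches(name, right, n=1, cutoff=cutoff)
    return report


# Invented spellings for illustration only.
left = ["Argentina", "Cote d'Ivoire", "Russia"]
right = ["Argentina", "Cote dIvoire", "Russian Federation"]
report = unmatched_names(left, right)

for name, candidates in report.items():
    print(name, "->", candidates or "no close match, check by hand")
```

Names with no suggestion still need manual inspection, since string similarity will miss renamings such as short versus official country names.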