Risk Score Card Develope

abess-team / abess

Fast Best-Subset Selection Library

https://abess.readthedocs.io/

Risk Score Card Develope #514

Closed cfkstat closed 3 months ago

cfkstat commented 1 year ago

Maximize the model's AUC on both the training set and the validation set, while ensuring that the difference between the two AUCs is less than 0.02, or that the difference between the two KS statistics is less than 3%. Note that the training set and validation set are split by time, for example by loan month.
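To make the request concrete, here is a minimal sketch of the two metrics and the gap check in plain Python. The function names (`auc`, `ks`, `gap_report`) and the default thresholds are illustrative assumptions taken from the numbers in the comment above; none of this is part of the abess API.

```python
def auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def ks(y_true, scores):
    """KS statistic: largest gap between cumulative positive and negative rates."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pairs = sorted(zip(scores, y_true), reverse=True)
    best, cum_pos, cum_neg, i = 0.0, 0, 0, 0
    while i < len(pairs):
        j = i
        # Consume all samples sharing the same score before measuring the gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def gap_report(y_train, s_train, y_valid, s_valid, max_auc_gap=0.02, max_ks_gap=0.03):
    """Compare train and validation metrics against the (assumed) gap thresholds."""
    auc_gap = abs(auc(y_train, s_train) - auc(y_valid, s_valid))
    ks_gap = abs(ks(y_train, s_train) - ks(y_valid, s_valid))
    return {
        "auc_gap": auc_gap,
        "ks_gap": ks_gap,
        "auc_ok": auc_gap < max_auc_gap,
        "ks_ok": ks_gap < max_ks_gap,
    }


# Toy scores: perfectly separated on train, weaker on validation.
report = gap_report([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9],
                    [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The report keeps the two checks separate so a caller can combine them with either "or" or "and", depending on which stability rule applies.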

Mamba413 commented 1 year ago

Thanks for this comment, but I do not fully understand it. Do you mean splitting samples across time, like sklearn.model_selection.TimeSeriesSplit?

cfkstat commented 1 year ago

It's similar, but not exactly the same. For example, to develop a loan application score, I use loan credit records from 202204 to 202208 as the training set and records from 202209 to 202210 as the validation set. The goal is to optimize the AUC on both the training set and the validation set without overfitting: the gap between the training and validation AUC must be less than or equal to 2%, and the gap between the KS values less than or equal to 3%.
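The single fixed cutoff described here (one training window, one later validation window) can be sketched as follows. This is only an illustration: the `split_by_month` helper, the `loan_month` field, and the toy records are hypothetical and do not come from abess or from the poster's data.

```python
def split_by_month(records, train_range, valid_range, month_key="loan_month"):
    """Split records into train/validation sets by inclusive YYYYMM ranges."""
    def in_range(month, bounds):
        start, end = bounds
        return start <= month <= end

    # Out-of-time validation only makes sense if the windows do not overlap.
    if train_range[1] >= valid_range[0]:
        raise ValueError("validation period must start after the training period ends")

    train = [r for r in records if in_range(r[month_key], train_range)]
    valid = [r for r in records if in_range(r[month_key], valid_range)]
    return train, valid


# Hypothetical loan records; months are integers in YYYYMM form.
records = [
    {"loan_month": month, "default": label}
    for month, label in [
        (202203, 0), (202204, 0), (202206, 1), (202208, 0),
        (202209, 1), (202210, 0), (202211, 1),
    ]
]

train, valid = split_by_month(records, (202204, 202208), (202209, 202210))
```

Unlike a rolling scheme with several folds, this produces exactly one train/validation pair, and records outside both windows are simply dropped.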

Mamba413 commented 1 year ago

Based on my understanding, the difference from sklearn.model_selection.TimeSeriesSplit is that you want to keep the gap between the validation-set AUC and the training-set AUC within a certain range (e.g., 2%). Is that correct?