alan-turing-institute / QUIPP-collab

Collaboration on the QUIPP project

FCA mortgage data #138

Open gmingas opened 4 years ago

gmingas commented 4 years ago

These are data on the universe of outstanding UK regulated mortgages, collected by the UK Financial Conduct Authority (FCA). They are tabular, structured data. The dataset has been updated every six months since June 2015 and contains information on loan characteristics (e.g. original and current balance, type of mortgage, current interest rate), performance (e.g. whether the loan was repaid or entered arrears) and some attributes of the person taking the loan (e.g. employment status, gross income, age, region). The data are not publicly accessible. They do not contain personal identifiers. A description can be found here, on page 13.

We are currently discussing a possible collaboration with researchers from Turing and QMUL (the main researcher is Saumitra Mishra) who work on the explainability of machine learning models trained on this data. They do not have direct access to the data and are seeking ways to create synthetic versions in order to facilitate their analysis. This might involve access within a secure safe haven environment for a named non-QUIPP researcher, who would be able to use the QUIPP code there, but this depends on FCA approval. The alternative is to provide the code to the FCA data providers, who would run the synthetic data experiments themselves.

We had a meeting on the 27th of October and are currently waiting on feedback from the FCA.