Open Yiannis20 opened 5 years ago
The synthetic data generation platform developed at the ONS Data Science Campus was applied to the NEED dataset.
Several preprocessing steps were carried out, including handling of categorical variables, removal of redundant variables and scaling.
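The three preprocessing steps named above could be sketched roughly as follows. This is only an illustration, not the platform's actual code: the function name `preprocess`, the example column names, the rule for "redundant" (columns with a single distinct value) and the choice of min-max scaling are all assumptions.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical preprocessing: drop redundant columns, encode, scale."""
    # Assumption: "redundant" means a column with only one distinct value.
    df = df.loc[:, df.nunique(dropna=False) > 1]

    # One-hot encode every non-numeric (categorical) column.
    cat_cols = list(df.select_dtypes(exclude=["number"]).columns)
    df = pd.get_dummies(df, columns=cat_cols, dtype=float)

    # Assumption: min-max scaling to [0, 1]; other scalers would also work.
    df = df.astype(float)
    col_min = df.min()
    col_range = (df.max() - col_min).replace(0, 1.0)  # guard against /0
    return (df - col_min) / col_range


if __name__ == "__main__":
    # Made-up columns for illustration; not the real NEED schema.
    example = pd.DataFrame(
        {
            "property_type": ["flat", "house", "flat"],
            "floor_area": [50.0, 120.0, 80.0],
            "country": ["UK", "UK", "UK"],  # constant, so it is dropped
        }
    )
    print(preprocess(example))
```

Scaling to a fixed range is a common choice before training an autoencoder, since it keeps all inputs on a comparable scale.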
Autoencoders were then used to generate synthetic data. The quality of the generated data was assessed by comparing its correlation structure with that of the real data. After several rounds of optimisation, the statistical reconstruction accuracy was satisfactory. The results are shown in Figures 1 and 2.
Figure 1: Correlation structure for the real data of the NEED dataset.
Figure 2: Correlation structure for the synthetic data generated with the Data Science Campus platform for the NEED dataset.
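A correlation structure comparison like the one shown in Figures 1 and 2 can also be summarised numerically. The sketch below is a minimal example under assumptions: the function name `correlation_gap`, the use of Pearson correlation and the two summary statistics are illustrative choices, not necessarily what the platform computes.

```python
import numpy as np


def correlation_gap(real: np.ndarray, synthetic: np.ndarray) -> dict:
    """Compare the Pearson correlation matrices of two datasets.

    Both inputs are 2-D arrays with rows as records and columns as
    variables, in the same column order.
    """
    real_corr = np.corrcoef(real, rowvar=False)
    synth_corr = np.corrcoef(synthetic, rowvar=False)

    # Only the off-diagonal upper triangle is informative: the diagonal
    # is always 1 and the matrix is symmetric.
    rows, cols = np.triu_indices(real_corr.shape[0], k=1)
    abs_diff = np.abs(real_corr[rows, cols] - synth_corr[rows, cols])

    return {
        "mean_abs_diff": float(abs_diff.mean()),
        "max_abs_diff": float(abs_diff.max()),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data: "synthetic" is the real data plus a little noise.
    real = rng.normal(size=(500, 4))
    real[:, 1] += 0.8 * real[:, 0]  # introduce some correlation
    synthetic = real + rng.normal(scale=0.1, size=real.shape)
    print(correlation_gap(real, synthetic))
```

Values close to zero would indicate that the synthetic data preserves the pairwise linear relationships of the real data. This check says nothing about higher-order structure or about disclosure risk.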
Following the data exploration phase, the synthetic data generation platform developed at the ONS Data Science Campus will be applied to the NEED dataset.