[x] Update your NN so that it has a latent space z. I think one hidden layer between (x, s) and z and one hidden layer between z and y are enough.
[x] Update the loss function so that it has three terms: accuracy (to minimize, since our model replicates decision making based on both sensitive and permissible attributes), fairness (to maximize; only permissible attributes are used, which makes the decisions fairer), and conservatism (to minimize; this is a temporary name for the term, which denotes how far we are from the original biased decisions).
[x] Check whether you need to add a constant K to your model when you make fair decisions, as shown in the notes in the comments.
[ ] Plot the results. The x axis shows the impact of conservatism, alpha (see the green variable in the notes): alpha = 0 means no conservatism, and alpha = 1 means accuracy and conservatism are equally important. The y axis shows two values of interest: the conservatism term and fairness.
[ ] Add a third plot: the dependence of mutual information (see the section to read) on alpha.
[ ] Calculate the conservatism and fairness of the "benchmark": the NN whose input is the permissible attributes only. This checks whether our idea improves on the simplest approach, which is to ignore sensitive attributes. You can add the results as vertical lines to the figure, or as a point on the plot of conservatism against fairness.
[ ] Upload the obtained results (basically, the figure) to Overleaf. If the results are fine, add a description of the results and the approach.
[ ] Check the literature to read; optionally add it to the thesis.
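
As a rough illustration of the three-term loss and the role of alpha described in the checklist, here is a minimal sketch of how the terms might be combined into one scalar objective. It is an assumption, not the formula from the notes: the function name `combined_loss` and the `fairness_weight` parameter are hypothetical, and the sign convention (subtracting fairness because it is to be maximized) is one plausible reading. The constant K from the notes is not modeled here.

```python
def combined_loss(accuracy_term, fairness_term, conservatism_term,
                  alpha, fairness_weight=1.0):
    """Return a scalar objective to minimize.

    Assumed form (hypothetical, not taken from the notes):
        accuracy + alpha * conservatism - fairness_weight * fairness

    - accuracy_term and conservatism_term are minimized, so they are added.
    - fairness_term is maximized, so it is subtracted.
    - alpha = 0 switches conservatism off; alpha = 1 weights accuracy and
      conservatism equally.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return (accuracy_term
            + alpha * conservatism_term
            - fairness_weight * fairness_term)


if __name__ == "__main__":
    # Toy numbers, purely illustrative (these are not results).
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, combined_loss(0.4, 0.2, 0.3, alpha))
```

For the plots, the same alpha grid could be reused: train once per alpha, then record the conservatism term and the fairness term separately rather than only the combined value.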